Federations are individual data partitions, which have their individual scheme centrally managed by a single distribution (or federation) scheme. That scheme defines and controls the single keying mechanism for cross-partition distribution. While a handful of data types are acceptable for the distribution key, I like the simplicity of a bigint as defining key (of any kind actually).
The individual partitions are members of the federation, each with their own schema. As such, they are responsible for any inclusive subset of the values in a federated table covered by the data type of the federation distribution key. The individual can be responsible for all of the values or a range of the values giving the architecture the ability to scale dynamically to match the current need. While each partition has its own schema, the table keys correspond to the federation scheme. A federation member may also contain tables that are not part of the federation, known as reference tables. Reference tables can be including in results that along with federation aware data. It is important to note that each partition controls its own schema. As such, it may or may not match the schema of other member partitions.
When building a federation plan, a paramount decision to make is deciding value upon which value to federate. I think the best practice may be to use a value that is meaningful to the data separation you are trying to achieve. In my world, the thing that makes the most sense is the customer or tenant identifier. This gives us the ability to centrally reference all data for querying, yet provide each customer with what amounts to a singularly responsible and sovereign data set.
While sharding is a great solution for these types of application, it is important to understand the complexity that accompanies the sharding process. Depending on the individual implementation flavor, sharding may developers handle rollbacks, constraints, and referential integrity across tables when historically those items have been handled by the database itself. It also makes joins, global searches and other high-level insight more difficult. Even knowing the trade-offs being made for the ability to scale data, it is hard to argue a properly executed sharding strategy for serving multi-tenant data in a web-mobile application world. The process checks all of the boxes required by the various user stories and operational concerns.
Sharding is a good example of a core belief of mine; It really should not matter how difficult or easy, how fancy or how simple a given technique or design is. The right answer should be the right answer. You should not over-design because one thing seems too simple, nor should you under-design because it seems too hard. The entirety of the platform truth should become self-evident and then pursued as the goal.