Sharding or Data Partitioning:
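– Sharding is a technique for breaking a large database into many smaller parts (shards) that can be spread across multiple machines.
– Horizontal scaling means adding more machines; past a certain size it is usually cheaper and more feasible than continuing to upgrade a single server.
– Vertical scaling means upgrading an existing server (more CPU, RAM or storage).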
– Partitioning Methods:
– Horizontal partitioning:
– Different rows go into different tables (databases). For example, rows can be split by location using ZIP code ranges. This is also called range-based sharding.
– The problem: if the ranges are not chosen carefully, the servers end up unbalanced.
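The range-based scheme above can be sketched as a lookup over sorted range boundaries. This is a minimal illustration, not a production router: the boundary values and the shard names (`shard-0` to `shard-3`) are made up for the example.

```python
import bisect

# Hypothetical exclusive upper bounds of the ZIP code range held by each shard.
RANGE_UPPER_BOUNDS = [25000, 50000, 75000, 100000]
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for_zip(zip_code: int) -> str:
    """Return the shard whose range contains zip_code."""
    if not 0 <= zip_code < RANGE_UPPER_BOUNDS[-1]:
        raise ValueError(f"zip code out of range: {zip_code}")
    # bisect_right returns the index of the first bound strictly greater
    # than zip_code, which is the index of the owning shard.
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, zip_code)]
```

If most users live in ZIP codes below 25000, `shard-0` takes most of the load, which is the imbalance problem mentioned above.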
– Vertical Partitioning:
– Tables related to a specific feature are stored on their own server. For example, in Instagram we can have one server for user profiles, one for photos and one for friend lists.
– The problem: if the application keeps growing, it becomes necessary to further partition a feature-specific DB across several servers.
– Directory based Partitioning:
– Create a lookup service that knows the current partitioning scheme and that the DB access code queries to find where data lives.
– To find a data entry, we query the directory server, which holds the mapping from each tuple key to its DB server.
– This makes it possible to add servers to the DB pool or change the partitioning scheme without impacting the application.
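A minimal sketch of the directory idea, using an in-memory dictionary as a stand-in for the lookup service. The class name `ShardDirectory`, its methods and the server names are hypothetical; a real directory would be a separate, replicated service.

```python
class ShardDirectory:
    """In-memory stand-in for a directory (lookup) service."""

    def __init__(self):
        # Maps each tuple key to the DB server that currently stores it.
        self._key_to_server = {}

    def assign(self, key: str, server: str) -> None:
        """Record (or update) which server holds the given key."""
        self._key_to_server[key] = server

    def lookup(self, key: str) -> str:
        """Return the server holding the key; raises KeyError if unknown."""
        return self._key_to_server[key]


directory = ShardDirectory()
directory.assign("user:42", "db-1")

# Moving the key to a newly added server only changes the directory entry;
# application code keeps calling lookup() exactly as before.
directory.assign("user:42", "db-3")
```

The trade-off is that every data access now depends on the directory, so it has to be fast and highly available.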
– Partitioning Criteria:
– Hash-based partitioning:
– Apply a hash function to some key attribute of the entity being stored; the result gives the partition number.
– The hash function should distribute data uniformly among the servers.
– The problem: this effectively fixes the total number of DB servers, since adding a new server means changing the hash function, which requires redistributing the data and downtime. The workaround is consistent hashing.
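To make the redistribution problem concrete, the sketch below compares plain `hash % N` placement with a simple consistent hash ring when a fifth server is added. It is an illustration under assumptions: the key format, the server names, the use of MD5 as a stable hash and the 100 virtual nodes per server are all choices made for this example.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is stable across runs (Python's built-in hash() is not)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def modulo_shard(key: str, num_servers: int) -> int:
    """Plain hash partitioning: partition number = hash mod server count."""
    return stable_hash(key) % num_servers


class ConsistentHashRing:
    """Minimal consistent hash ring with virtual nodes."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed at `vnodes` pseudo-random points on the ring.
        self._ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key: str) -> str:
        # A key belongs to the first ring point after its hash, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(10_000)]

# Growing from 4 to 5 servers with modulo placement.
moved_modulo = sum(modulo_shard(k, 4) != modulo_shard(k, 5) for k in keys)

# Growing from 4 to 5 servers on the ring.
old_ring = ConsistentHashRing(["db-0", "db-1", "db-2", "db-3"])
new_ring = ConsistentHashRing(["db-0", "db-1", "db-2", "db-3", "db-4"])
moved_ring = sum(old_ring.server_for(k) != new_ring.server_for(k) for k in keys)

print(f"modulo: {moved_modulo / len(keys):.0%} of keys moved")
print(f"consistent hashing: {moved_ring / len(keys):.0%} of keys moved")
```

With modulo placement a large majority of keys change servers, while on the ring only the keys taken over by the new server move.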
– List Partitioning:
– Each partition (DB server) is assigned a list of values, e.g. the APAC, EMEA and US regions each get their own partition.
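List partitioning amounts to a fixed value-to-partition table. A small sketch, assuming country codes as the partitioning value; the partition names and the country lists are hypothetical.

```python
# Hypothetical list of values assigned to each partition (DB server).
PARTITION_VALUES = {
    "db-apac": {"IN", "JP", "AU"},
    "db-emea": {"DE", "FR", "AE"},
    "db-us": {"US"},
}

# Invert the table once so that each lookup is a single dictionary access.
VALUE_TO_PARTITION = {
    value: partition
    for partition, values in PARTITION_VALUES.items()
    for value in values
}


def partition_for(country_code: str) -> str:
    """Return the partition whose list contains country_code."""
    try:
        return VALUE_TO_PARTITION[country_code]
    except KeyError:
        raise ValueError(f"no partition lists {country_code!r}") from None
```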
– Round-robin Partitioning:
– Records are assigned to partitions one by one in turn (the i-th record goes to partition i mod n), which ensures a uniform data distribution.
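Round-robin assignment can be sketched with a cycling iterator. The function name and the server names are placeholders for this example.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(records, servers):
    """Pair each record with the next server in turn."""
    server_cycle = cycle(servers)
    return [(record, next(server_cycle)) for record in records]


placement = assign_round_robin(range(9), ["db-0", "db-1", "db-2"])

# Count how many records landed on each server.
load = Counter(server for _, server in placement)
```

The load is even, but the server cannot be derived from the record itself, so reads need either a directory or a query to every partition.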
– Composite partitioning:
– Combine any of the above partitioning schemes to devise a new scheme, e.g. list partitioning first, then hash partitioning within each list partition.
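A sketch of the list-then-hash combination: the region picks a group of servers, and a hash of the key picks one server inside that group. The server names, the group sizes and the use of SHA-256 are assumptions made for the example.

```python
import hashlib

# Hypothetical server groups: list partitioning by region comes first.
REGION_SERVERS = {
    "APAC": ["apac-0", "apac-1"],
    "EMEA": ["emea-0", "emea-1", "emea-2"],
    "US": ["us-0", "us-1"],
}


def composite_partition(region: str, user_id: str) -> str:
    """List partitioning by region, then hash partitioning within the region."""
    servers = REGION_SERVERS[region]
    digest = hashlib.sha256(user_id.encode()).digest()
    # Hash step: map the key onto one of the servers in the chosen group.
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```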
– Common problems with Sharding:
– Joins & Denormalisation:
– Performing joins on a database running on one server is straightforward, but once the DB is partitioned and spread across multiple machines, joins that span shards are often not feasible.
– Such joins are not performance efficient, because data has to be gathered from multiple servers.
– The workaround is to denormalise the DB so that queries that previously required joins can be served from a single table.
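The difference can be sketched with plain dictionaries standing in for shards. The data, field names and function names below are made up for the illustration.

```python
# Normalised layout: users and photos live on different shards.
users_shard = {1: {"name": "alice"}}
photos_shard = {100: {"user_id": 1, "url": "/p/100.jpg"}}


def photo_with_owner_normalised(photo_id):
    """Application-level join: two lookups against two different shards."""
    photo = photos_shard[photo_id]          # query 1: photo shard
    owner = users_shard[photo["user_id"]]   # query 2: user shard
    return {"url": photo["url"], "owner_name": owner["name"]}


# Denormalised layout: the owner's name is copied into each photo row.
photos_denormalised = {
    100: {"user_id": 1, "owner_name": "alice", "url": "/p/100.jpg"},
}


def photo_with_owner_denormalised(photo_id):
    """Single lookup against one shard, no join needed."""
    photo = photos_denormalised[photo_id]
    return {"url": photo["url"], "owner_name": photo["owner_name"]}
```

The price of the second layout is duplicated data: if the user changes their name, every copied `owner_name` has to be updated as well.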
– Referential integrity:
– Enforcing data integrity constraints such as foreign keys in a sharded DB can be extremely difficult. Most RDBMSs do not support foreign key constraints across databases on different servers.
– Rebalancing:
– Rebalancing is needed when the data distribution is not uniform.
– It is also needed when there is a lot of load on a particular shard.
In this video, we cover sharding as used in system design:
– What is Sharding / Data Partitioning
– Sharding Methods
– Sharding Criteria
– Sharding Challenges
CHECK OUT CODING SIMPLIFIED
https://www.youtube.com/codingsimplified
I started my YouTube channel, Coding Simplified, during Dec of 2015.
Since then, I’ve published over 200 videos. My account is Partner Verified and my earnings are deposited directly into my account every month.
★☆★ VIEW THE BLOG POST: ★☆★
http://thecodingsimplified.com
★☆★ SEND EMAIL At: ★☆★
Email: thecodingsimplified@gmail.com

