database.md

| Use case | Solution | Example | Notes |
| --- | --- | --- | --- |
| Caching | Key-value store: cache the output for a query pattern or key that is requested repeatedly. | Redis, Memcached, etcd | etcd is a key-value store used mainly for configuration and coordination rather than as a general-purpose cache. |
| File storage for images and videos | Files are served directly; no query needs to be executed on the files. | Blob storage (e.g. Amazon S3) | A CDN can be placed in front of the blob store to serve data faster. |
| Text search on unstructured data | The data (labels, logs, etc.) is unstructured and may not be relational. A document DB fits this scenario. | MongoDB | |
| Text search on unstructured data with ever-increasing data | The data is unstructured and grows rapidly. A columnar (wide-column) DB fits this scenario. | Cassandra | |
| Metrics tracking | Metrics, such as those shown in Grafana, are stored in a time-series DB rather than a relational one. It works in an append-only mode: new items are added, but items in the middle of the DB are not updated. | InfluxDB, Graphite, OpenTSDB, etc. | VictoriaMetrics uses multiple DS and is a great tool for metrics. |
| Analytics for a large dataset | A data warehouse or data lake supports analysis of such large amounts of data. The data is used for reporting and NOT for a transactional system, meaning we do not intend to update it frequently. | Hadoop | |
| Multi-disciplinary course | A student starts from Course A-1 and can move to A-2 or B1. If they choose B1, they can move to B2-a or B2-b. The data is related, but the flow can have multiple branches. A graph DB is designed for this kind of data. | Neo4j | |
| Search engine DB | The data can be structured or unstructured; in both cases the lookup time for a text search query can be large. A search-engine DB is built for quick text lookups. | Elasticsearch | Fuzzy search can be implemented on this DB to match wrong or similar spellings. |
| Text search on structured data | The data is structured, so there are relations between multiple data sources. | RDBMS | Atomicity, Consistency, Isolation and Durability (ACID) need to be ensured. PostgreSQL or MySQL works for most scenarios, but if we need to scale horizontally repeatedly (adding more nodes), MariaDB may be beneficial. |
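
The multi-disciplinary course row describes a branching flow, which is easiest to see as a small graph. The sketch below is plain Python, not Neo4j: the `NEXT_COURSES` mapping and the `reachable_courses` helper are hypothetical names chosen for illustration, and the edges are only the ones named in that row.

```python
from collections import deque

# Hypothetical course graph built from the example in the table:
# each key maps to the courses a student may move to next.
NEXT_COURSES = {
    "A-1": ["A-2", "B1"],
    "B1": ["B2-a", "B2-b"],
}


def reachable_courses(start):
    """Return every course reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        course = queue.popleft()
        # Courses with no outgoing edges simply have no entry in the mapping.
        for nxt in NEXT_COURSES.get(course, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(reachable_courses("A-1"))
```

A graph DB stores these edges natively and answers the same kind of "what can follow this course" question with a traversal query, without the joins an RDBMS would need for each extra hop.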
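
The fuzzy search mentioned for the search-engine DB can be illustrated with Python's standard library. This is only a stand-in for the idea of matching misspelled terms, not how Elasticsearch implements it; the `INDEXED_TERMS` list, the `fuzzy_lookup` helper, and the 0.8 cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical list of indexed terms; a real search engine would hold
# these in an inverted index rather than a Python list.
INDEXED_TERMS = ["postgresql", "cassandra", "mongodb", "redis"]


def fuzzy_lookup(query, cutoff=0.8):
    """Return indexed terms whose spelling is close to `query`."""
    # get_close_matches scores each candidate by string similarity and
    # keeps those at or above `cutoff`, best match first.
    return difflib.get_close_matches(query, INDEXED_TERMS, n=3, cutoff=cutoff)


print(fuzzy_lookup("casandra"))
```

Lowering `cutoff` tolerates more spelling errors at the cost of more false matches, which is the same trade-off a fuzziness setting controls in a search engine.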

reedip/database.md

reedip commented Feb 23, 2022
