You may have the fastest computers, the most scalable server farm, and the best hand-optimized code with the smartest algorithms that computer science can create. However, without data all of it is for naught. Imagine a 3D game engine with no models to render, or a web search engine with no web to search. Pretty boring and useless, right? While databases may not sound like they are the most exciting branch of computing tech, they are arguably the most important and useful.
From a developer's perspective, a program can simply keep its information in large structures in its own memory. In fact, this is how introductory programs at the university level are written until file system I/O and database integration are covered. The catch is that data stored in a live program has the same lifespan as the program. Small programs can usually discard their data on exit without any trouble, but almost any useful program needs persistence.
Relational databases are powerful for their ability to store data with well-defined, structured relations and use queries to present and join the data according to relational algebra. They also make what could be called the “ACID promise” – atomicity, consistency, isolation, and durability.
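To make the atomicity part of that promise concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names in it, and the transfer helper are all made up for illustration. The credit is deliberately applied before the debit so that a failed debit has something to undo.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical `accounts` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back together with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


conn = make_demo_db()
print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer asks for more than alice has, so the database rejects the debit and also undoes the credit to bob that preceded it. Neither account ends up in a half-updated state, which is the "A" in ACID.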
Much of modern computing tech has its origin in the big, formalized, buttoned-down world of companies such as IBM. The SQL language was developed for IBM's in-house relational database in the 1970s and was commercialized in 1979 by the company that would become Oracle. Big companies tend toward order, control, and quality assurance. Databases meant for these environments reflect an aversion to the risk of losing even one byte of data, or of running a query that might not return the expected results. The reliability of the database trumps everything – even performance.
Relational databases, usually driven by SQL (though an RDBMS and SQL are not the same thing), have served industry and the web fairly well for decades. However, the sheer volume of data flowing through our modern systems – through the web, through our mobile devices, through Twitter, etc. – is driving the adoption of a looser, faster database model.
This is where NoSQL comes into play. The name isn’t meant to be a dig at SQL (in fact it is sometimes read as “Not only SQL”, since some NoSQL databases allow SQL queries). It could more accurately be called “NoREL” – no relational model. While relational database systems provide flexibility and reliability, the truth is that many applications do not need a relational model and, as scary as it may seem, do not need “ACID” guarantees either. The capabilities offered by NoSQL systems tend to be store and retrieve only, but the gains in performance and potential data volume are enormous.
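As a rough picture of what "store and retrieve only" means, here is a toy in-memory document store in Python. TinyDocumentStore and its put and get methods are hypothetical names invented for this sketch; it is not the API of MongoDB or of any real NoSQL product, and it leaves out persistence, indexing, and distribution entirely.

```python
import copy
import uuid


class TinyDocumentStore:
    """Hypothetical store: schemaless documents addressed by a key."""

    def __init__(self):
        self._docs = {}

    def put(self, doc, key=None):
        """Store a document and return its key (generated if not supplied)."""
        if key is None:
            key = uuid.uuid4().hex
        # Copy so that later changes by the caller do not alter stored data.
        self._docs[key] = copy.deepcopy(doc)
        return key

    def get(self, key):
        """Return a copy of the stored document, or None if the key is unknown."""
        doc = self._docs.get(key)
        return copy.deepcopy(doc) if doc is not None else None


store = TinyDocumentStore()

# Documents need not share a schema: no tables, no joins, no fixed columns.
store.put({"user": "ann", "checkins": ["cafe", "park"]}, key="user:ann")
store.put({"listing": "bike", "price": 120, "tags": ["used"]}, key="ad:42")

print(store.get("user:ann"))
print(store.get("ad:42"))
print(store.get("missing"))
```

Everything here is a lookup by key. There is no query planner and no transaction spanning several documents, and giving those up is a large part of what lets real systems spread data across many machines.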
MongoDB is an open-source leader in the NoSQL family. It was officially declared production-ready in 2010 and has since been adopted by web companies such as Craigslist and Foursquare. It is well-suited to a range of problem domains, including mobile data and high-volume analytics. Follow us for more on the exciting promises of MongoDB and other NoSQL systems.