NoSQL is a rapidly evolving market with products undergoing constant change. Having so many NoSQL databases available is a double-edged sword. With so many differences out there, common misconceptions form and become lore.
NoSQL is a single type of database
NoSQL is a catch-all term for a variety of database types that share common architectural approaches. These databases aren't designed for data organized into related tables, rows, and columns. They are highly distributed, which means data is spread across several servers, and they're tolerant of data structure changes (that is, they're schema agnostic).
You can find several types of databases under the NoSQL banner:
• Key-value stores provide easy and fast storage of simple data through use of a key.
• Column stores provide support for very wide tables but not for relationships between tables.
• Document stores support JSON and/or XML hierarchical structures.
• Triple (and graph) stores provide the same flexibility to relationships that document NoSQL databases provide to record structures.
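To make the four types above concrete, here is a minimal sketch that holds roughly the same customer information in each model, using plain Python structures. The keys, field names, and the `objects` helper are hypothetical and chosen only for illustration; no real database stores data exactly like this.

```python
import json

# Key-value store: an opaque value retrieved by a single key.
key_value_store = {"customer:42": '{"name": "Ada", "city": "London"}'}

# Column store: row key -> column family -> column -> value.
# Rows in the same table may hold entirely different columns.
column_store = {
    "42": {"profile": {"name": "Ada", "city": "London"}},
    "43": {"profile": {"name": "Grace"}, "orders": {"2024-01-03": "book"}},
}

# Document store: a hierarchical (JSON-like) structure per record.
document_store = {
    "/customers/42.json": {
        "name": "Ada",
        "address": {"city": "London"},
        "tags": ["vip"],
    },
}

# Triple store: (subject, predicate, object) facts, so relationships
# can be added freely without changing any record structure.
triple_store = {
    ("customer:42", "name", "Ada"),
    ("customer:42", "livesIn", "city:London"),
    ("customer:42", "knows", "customer:43"),
}


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triple_store if s == subject and p == predicate}


name_from_kv = json.loads(key_value_store["customer:42"])["name"]
friends = objects("customer:42", "knows")
```

Note that the key-value store must parse the whole value to read one field, whereas the triple store answers a relationship question directly.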
NoSQL databases aren't ACID-compliant
ACID compliance is the gold standard of data safety. By ensuring that operations are atomic, views of data are consistent, operations don't interfere with each other, and data is durably saved to disk, you protect your data. People often think NoSQL databases do not provide ACID compliance.
ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. - Wikipedia
Many NoSQL databases provide full ACID support across clusters. MarkLogic Server, OrientDB, Aerospike, and Hypertable are all fully ACID-compliant, providing either fully serializable or read-committed ACID compliance.
Many other NoSQL databases can provide ACID-like consistency by using sensible settings in client code. This typically involves a Quorum or All setting for both read and write operations. These databases include Riak, MongoDB, and Microsoft DocumentDB.
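The reasoning behind Quorum and All settings can be sketched with simple arithmetic. Assuming a record is copied to N replicas, a write waits for W acknowledgments, and a read consults R replicas, a read is guaranteed to see the latest acknowledged write when R + W > N, because the two sets of replicas must overlap. The function names below are hypothetical; this is generic quorum arithmetic, not the client API of any of the databases named above.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas` (the usual meaning of a Quorum setting)."""
    return replicas // 2 + 1


def reads_see_latest_write(replicas: int, read_level: int, write_level: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_level + write_level > replicas


N = 3
quorum_both = reads_see_latest_write(N, quorum(N), quorum(N))  # Quorum reads and writes
one_and_one = reads_see_latest_write(N, 1, 1)                  # fastest, weakest setting
all_writes = reads_see_latest_write(N, 1, N)                   # All on write, One on read
```

With three replicas, Quorum on both sides means two replicas each, which is why that combination behaves consistently while a One/One combination does not.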
NoSQL databases lose data
This misconception arises when NoSQL databases are used incorrectly or when less mature products are used. Some NoSQL products have been around for fewer than five years, so they haven't yet developed data loss prevention features.
The guarantee of durability in ACID compliance is vital for enterprise systems, and ACID-compliant NoSQL databases provide this guarantee. Therefore, you're assured that no data is lost once the database confirms the data is saved.
Furthermore, eventually consistent databases can also provide data durability through careful use of write-ahead logging (WAL). Many NoSQL databases provide this capability.
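As a rough illustration of the write-ahead logging idea, the sketch below appends each change to a log file and forces it to disk before updating the in-memory data, then rebuilds that data from the log after a restart. The `TinyWalStore` class, its methods, and the log format are all hypothetical; real databases use far more elaborate logs with checkpoints and checksums.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recover state from any existing log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    break  # ignore a torn final entry from a crash mid-write
                self.data[entry["key"]] = entry["value"]

    def put(self, key, value):
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # durable on disk before we confirm
        self.data[key] = value

    def close(self):
        self._log.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "wal.log")
    store = TinyWalStore(path)
    store.put("trade:1", {"qty": 100})
    store.close()  # stand-in for a restart: in-memory state is discarded

    recovered = TinyWalStore(path)
    recovered_value = recovered.data["trade:1"]
    recovered.close()
```

The important ordering is in `put`: the log entry reaches disk first, so a confirmed write can always be replayed.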
NoSQL databases aren't ready for mission-critical enterprise applications
On the contrary, many organizations are using NoSQL databases for mission-critical workloads, including the following:
• Defense and intelligence agencies storing and sharing information
• Media companies storing all their digital assets for publication and purchasing in NoSQL databases
• Media companies providing searchable metadata catalogs for their video and audio media
• Banks using NoSQL databases as primary trade stores or back office anti-fraud and risk-assessment systems
• Government agencies using NoSQL databases as the primary back ends for their health care systems
These are not small systems or simple caches for relational systems. They are cases for which NoSQL is well suited. Of course, some NoSQL databases are more ready for enterprise systems than others.
NoSQL databases aren't secure
Not so! Many NoSQL databases now provide record-level and even data-item-level (cell) security. Microsoft DocumentDB, MarkLogic Server, OrientDB, AllegroGraph, and Accumulo all provide fine-grained role-based access control (RBAC) to access records stored within these NoSQL databases.
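One way to picture data-item-level (cell) security is to attach a set of permitted roles to every value and filter on read. The sketch below assumes that simple model; the `Cell` type, the `visible_fields` function, and the role names are hypothetical and do not reproduce the security API of any database listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: object
    allowed_roles: frozenset  # roles permitted to read this one value


def visible_fields(record, user_roles):
    """Return only the fields whose cell shares a role with the user."""
    roles = set(user_roles)
    return {
        name: cell.value
        for name, cell in record.items()
        if cell.allowed_roles & roles
    }


patient = {
    "name": Cell("Ada", frozenset({"clerk", "doctor"})),
    "diagnosis": Cell("fracture", frozenset({"doctor"})),
}

clerk_view = visible_fields(patient, {"clerk"})
doctor_view = visible_fields(patient, {"doctor"})
```

Two users reading the same record receive different views, which is the effect cell-level security aims for.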
Many NoSQL databases integrate with existing Lightweight Directory Access Protocol (LDAP), Kerberos, and certificate-based security systems. These databases also support encryption over the wire for all client-to-server communications and for internode communications within a cluster.
Some NoSQL databases are even accredited and used by defense organizations. Accumulo came from a National Security Agency (NSA) project. MarkLogic Server is independently accredited under the U.S. Department of Defense's (DoD) Common Criteria certification.
Not all NoSQL databases provide this functionality, though the majority of them probably will in the future. For now, you have choices that enable you to secure information.
All NoSQL databases are open-source
There are numerous open-source databases in the NoSQL world. Many commercial companies have attempted to replicate Red Hat's success by offering a subset of their products' capabilities to be used for free under an open-source license.
Many of these companies' platforms don't support open standards, though, and the companies themselves contribute most of the code. The base versions offered by these “open-source” companies also provide only limited features.
There are many fully commercial companies in the NoSQL space. Microsoft, MarkLogic, Franz (AllegroGraph), Hypertable, and Aerospike are all great commercial companies offering NoSQL databases, and they're succeeding at it.
NoSQL databases are only for Web 2.0 applications
Their use in new web and mobile application stacks has made NoSQL databases popular. They're easy to use from the start, and many operate under a for-free license agreement, making them attractive to startups.
Social media applications commonly use NoSQL databases. These applications bring in web-published data and aggregate it to discover valuable information.
The vast majority of use cases, though, aren't Web 2.0-type applications. They're the same applications that have been around a long time, but where relational databases no longer provide an adequate solution. This includes scenarios where the data being stored is very sparse, with many blank (null) values, or where there is frequent change over time of the structure of the information being stored.
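The sparse-data and changing-structure scenarios can be sketched by comparing a fixed set of columns with schema-agnostic documents. The column names and records below are invented for illustration only.

```python
# Relational-style rows: every row carries every column, so absent
# values must be stored as NULL (None here).
COLUMNS = ("id", "name", "fax", "pager", "twitter")
rows = [
    (1, "Ada", None, None, None),
    (2, "Grace", None, None, "@grace"),
]
null_count = sum(value is None for row in rows for value in row)

# Document-style records: each record lists only the fields it has.
documents = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace", "twitter": "@grace"},
]

# A structure change is just a new field on new records; existing
# records are untouched and no table-wide migration is needed.
documents.append({"id": 3, "name": "Linus", "loyalty_tier": "gold"})

fields_in_use = sorted({field for doc in documents for field in doc})
```

In the row form, half of the stored values are placeholders; in the document form, nothing is stored for missing data.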
NoSQL is just hype
Microsoft, Oracle, and IBM each have their own NoSQL database on the market right now. Although susceptible to bluster, these companies invest in technology only when they see a profit.
Established players like MarkLogic with years on the market have also proved that NoSQL technology isn't just hype and is valuable to a range of real-world customers across industries in mission-critical systems.
NoSQL developers don't understand how to use an RDBMS
There is a common misconception (by evil relational database application developers; you know who you are!) that NoSQL is used because developers don't have a grasp on the fundamentals needed to configure relational databases so that they perform well.
This is completely incorrect. NoSQL comprises a range of approaches brought together to answer fundamentally different data problems than a relational database management system (RDBMS) solves.
If you're comparing an RDBMS to a NoSQL database, then you're comparing apples to motorbikes! NoSQL databases will not replace RDBMSs. They are intended for data that's structured in a fundamentally different way, as well as for different data problems.
Updated RDBMS technology will remove the need for NoSQL
Many of the highly distributed approaches of NoSQL are being blended with RDBMS technology, which has resulted in the emergence of many NewSQL databases.
Although NewSQL is helping to deal with NoSQL developers' criticisms of RDBMS technology, NewSQL is organized around the same data structures as an RDBMS is.
NoSQL databases are for different data problems, with different data structures and use cases.
About the Book Author
Adam Fowler is a principal sales engineer with MarkLogic, Inc. He has previously worked for IPK, FileNet, and IBM as well as smaller companies. Adam writes for and runs a popular blog on NoSQL and big data, which is republished on DZone.com. He's a frequent speaker at NoSQL conferences.
Get up to speed on the nuances of NoSQL databases and what they mean for your organization
This easy-to-read guide to NoSQL databases provides the no-nonsense overview and analysis that you need, including what NoSQL is and which database is right for you. Featuring specific evaluation criteria for NoSQL databases, along with a look into the pros and cons of the most popular options, NoSQL For Dummies provides the fastest and easiest way to dive into the details of this incredible technology. You'll gain an understanding of how to use NoSQL databases for mission-critical enterprise architectures and projects, and real-world examples reinforce the primary points to create an action-oriented resource for IT pros.
If you're planning a big data project or platform, you probably already know you need to select a NoSQL database to complete your architecture. But with options flooding the market and updates and add-ons coming at a rapid pace, determining what you require now, and in the future, can be a tall task. This is where NoSQL For Dummies comes in!
• Learn the basic tenets of NoSQL databases and why they have come to the forefront as data has outpaced the capabilities of relational databases
• Discover major players among NoSQL databases, including Cassandra, MongoDB, MarkLogic, Neo4J, and others
• Get an in-depth look at the benefits and disadvantages of the wide variety of NoSQL database options
• Explore the needs of your organization as they relate to the capabilities of specific NoSQL databases
Big data and Hadoop get all the attention, but when it comes down to it, NoSQL databases are the engines that power many big data analytics initiatives. With NoSQL For Dummies, you'll go beyond relational databases to ramp up your enterprise's data architecture in no time.