Friday, October 4, 2019

Test Your Foundation Knowledge

The Web is chock-full of unnoticed/unquestioned pronouncements by novices or "experts", many self-taught, that are (1) wrong, or (2) gobbledygook. Any attempt to demonstrate the lack of foundation knowledge underlying these misconceptions and their practical implications is usually dismissed as "theory, not practical", attacked as "insulting ad-hominem", or ignored altogether, regardless of the amount and quality of the supporting evidence and argument logic. This is understandable: in the absence of foundation knowledge and ability to reason, it is by definition impossible to comprehend and appreciate corrections that require them.

Practitioners who cannot detect such misconceptions and understand their practical implications and the importance thereof are insufficiently prepared for a professional career in data management. Worse, they cannot associate problems with their real causes and, thus, cannot come up with proper solutions, which explains the industry's "cookbook approach" and succession of fads.

What about you? This is another batch in the Test Your Foundation Knowledge regular series of posts of online statements reflecting common misconceptions that are difficult to discern without foundation knowledge. You can test yours by trying to debunk them in Comments, including which category, (1) or (2), each falls into. If you can't, proper education is in order.



Up to 2018, DBDebunk was maintained and kept free with the proceeds from my @AllAnalytics column. In 2018 that website was discontinued. The content of this site is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help maintain it by purchasing publications, or donating. Thank you.


  • 08/09/19: Following my series of posts on data sublanguage (Parts 1-4), I have revised for consistency the corresponding section of paper #2 in the Understanding the Real RDM series, Logical Access, Data Sublanguage, Kinds of Relations, and Database Redundancy and Consistency, which is available for ordering from the PAPERS page.



  • To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017, were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
  • Following the discontinuation of AllAnalytics, the links to my columns there no longer work. I moved the 2017 columns to dbdebunk and, time permitting, may gradually move all of them. Within the columns, only the links to sources external to AllAnalytics may work.


I deleted my Facebook account. You can follow me:

  • @DBDdebunk on Twitter: will link to new posts on this site, as well as To Laugh or Cry? and What's Wrong with This Picture posts, and my exchanges on LinkedIn.
  • @The PostWest blog: Evidence for Antisemitism/AntiZionism – the only universally acceptable hatred – as the (traditional) response to the existential crisis of decadence and decline of Western (including the US)
  • @ThePostWest Twitter page where I comment on global #Antisemitism/#AntiZionism and the Arab-Israeli conflict.

“Relational Databases such as MySql, Postgres, Oracle, etc couldn’t scale well. SQL databases are not designed for horizontal scaling. Joining dataset & data aggregation from many machines introduces complexity in our design.”
“Relational Databases have a rigid schema. Users have to go through many iterations to model data. Altering the data type of an attribute becomes a nightmare for developers, leads, and DBAs. NoSQL overcomes this limitation by providing a flexible schema. These databases abstract out the data storage and internal working from the users. They provide support to store user-defined data structures. For eg: data can be stored in the form of a JSON object. Users have the flexibility to add, replace or remove attributes from the data.”
“SQL and NoSQL are analogous to statically typed and dynamically typed programming languages. SQL databases are like C, C++ where you define the data first and later store values. NoSQL offers python like capabilities where you assign any value to a variable, & it works.
  • SQL is analogous to C, C++, Java
  • NoSQL is analogous to Python”
“Being Schema agnostic, NoSQL databases are also termed as schema-on-read databases. You only need to know how the data is stored while reading the data.”
“Flexible schema shortens the development time. You no longer have to go through many iterations of data modelling & design. Developers can store & retrieve whatever they want. The only downside of the schema-less design is that it increases the risk as there is a lack of control. It’s only a threat if a developer modifies a production system bypassing the development process.”
“SQL databases support null values for columns. For eg:- A bank application webpage has many optional fields like street name, nickname, etc. If the users don’t populate optional fields, the database will still reserve space for these columns in case the users update them in future. In the NoSQL database, you don’t pass the null entries and storage is hence optimised.”
“Schema-less doesn’t imply any random garbage can be stored in the database. For example, if a database column supports JSON data type, the JSON must be well formatted. The application will get an error if it tries to store a malformed JSON object.”
“Relational databases organise data into rows and columns. You can store data in many tables and the tables can have different relationships. To fetch the data, you can join the tables on the value of an attribute. The application performance degrades when the number of tables to be joined goes in double digits or higher. There is a significant drop in speed in case the application joins tables stored on different database servers.”
“NoSQL databases are denormalised. There is no concept of the relationship between records in NoSQL databases. This means instead of you only store the aggregate data in a single table instead of scattering it across different tables. For instance, when you are designing a food delivery app using RDBMS, you’ll create multiple tables- one for users, restaurant, orders. In a NoSQL database, a single orders table can have a restaurant, user data duplicated across many rows. The downside of data duplication is overcome by the above-mentioned benefits.”
“You can avoid creating complex ER diagrams and writing complicated SQL queries. With NoSQL databases, you can speed up your development and focus on getting things done.”
                                         --NoSQL databases — An Introduction
