Friday, March 25, 2022

SMS: Keys and Indexes

Note: "Setting Matters Straight" is a new format: I post on LinkedIn an online Q&A involving data fundamentals to encourage readers to test their foundation knowledge, which they can then compare with our debunking here, where we confirm what is correct and correct what is fallacious (with clarifications, wherever necessary). For in-depth treatment check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs).


Q: “What is the difference between a primary key, a unique key, and an index in databases?”

A: “Unique key is a field (or fields) with a set of unique values; the uniqueness is usually enforced with UNIQUE constraint. There might be one or more per table. Every PRIMARY key is always a unique key; there should be only one per table. It uniquely identifies record, and is used to enforce integrity - entity integrity, and, in tandem with FOREIGN key, referential integrity. Index is a data structure to facilitate records search. It might be created on PRIMARY key (best practice), unique key or any other field or combination thereof in the table. The limit on how many indices a table might have is defined in RDBMS implementation. An index might - or might not - speed up some queries.”

A: “The primary key is inherently indexed and unique and is the cross reference to related tables. Often the best primary key is an auto number integer as any value entered by humans is subject to error or delay that can be challenging to manage in the user interface ... whereas an auto number is assigned immediately upfront and eliminates any possible record conflict in tables during multi user entries. A unique key is somewhat of an informal definition. My view is that it is a definition of a field that is not being used as the primary key, but is unique unlike i.e. Last Name -- for instance a social security number field. So it is not the primary key as it is not the field/value being used to cross reference to related tables but it is unique in the table.”

A: “A primary key is a unique, non null value which can identify every tuple (row in the table) uniquely. A unique key/column/constraint ensures that no two rows contain the same value (almost the same as primary key). Unless specified explicitly for the column configuration, a NULL is a valid value for column with unique constraint. A index can be thought of as the appendix at the end of the book. The information is sorted in specific order so that look up is easy and it points to the location that is being searched for.”
--Quora.com

Friday, March 18, 2022

SMS: Entities and Records

 

Note: "Setting Matters Straight" is a new format: I post on LinkedIn an online Q&A involving data fundamentals to encourage readers to test their foundation knowledge, which they can then compare with our debunking here, where we confirm what is correct and correct what is fallacious (with clarifications, wherever necessary). For in-depth treatment check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs).

Q: “What is the relationship between an entity and a record?”

A: “In the context of a database design, an ‘entity’ is a type or category of persons, places, things or events. It’s a collectivisation of the nouns in a system about which you wish to keep data. For example, Employee might be the name of an entity in your system. A ‘record’ is a collection of data about a specific entity, a particular person or place, an identifiable thing, or a single event. For example, Name: ‘Dave Voorhis’, StartYear: 2019, Salary: £1,398,293 might be a record of one Employee entity in your system.”

A: “Database, file, and recordset are basically the same thing. They are collections of information or data. Each database or file or recordset typically has some sort of common purpose or definition. Like a database (relational, hierarchical, etc.) of data of a business process. A File is again a collection of data such as all transactions to be posted. A recordset is also basically a file.

Entity and table are basically the same thing. While you have the grouping of all the data, and entity (logical view) and a table (physical view) are the same. As Dave said, it is a logical grouping of a specific piece of data.

File, recordset, record, row or line are basically the same. A .csv file is a grouping of records. A file is a grouping of records. A row is an individual grouping of data from a relational database.

The last is element or attribute or field. This is the individual piece of data like Transaction_Amount or First Name.”
--Quora.com
A simple and the answer oversimplifies. But things seem simple only in the absence of foundation knowledge. Practitioners use different terms for the same thing, or the same word for different things, but that must be corrected, not accepted or validated.

Sunday, March 6, 2022

TYFK: Relationships -- Uniqueness and Attribute Constraints

Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references -- reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date -- which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

 

“A unique constraint is a type of column restriction within a table, which dictates that all values in that column must be unique [and] allows null values ... a null is the complete absence of a value (not a zero or space). Thus, it is not possible to say that the value in that null field is not unique, as nothing is stored in that field.”
--Techopedia
This is one of my recent "What's Wrong with this database picture" posts on LinkedIn.

Misconceptions

In the RDM a uniqueness constraint:

  • Should not be viewed solely as a "column restriction within a table'.
  • Does NOT allow SQL "NULLs" (not "NULL values"), which have nothing to do with storage.

View My Stats