Friday, March 25, 2022

SMS: Keys and Indexes



Note: "Setting Matters Straight" is a new format: I post on LinkedIn an online Q&A involving data fundamentals to encourage readers to test their foundation knowledge, which they can then compare with our debunking here, where we confirm what is correct and correct what is fallacious (with clarifications, wherever necessary). For in-depth treatment check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs).


Q: “What is the difference between a primary key, a unique key, and an index in databases?”

A: “Unique key is a field (or fields) with a set of unique values; the uniqueness is usually enforced with UNIQUE constraint. There might be one or more per table. Every PRIMARY key is always a unique key; there should be only one per table. It uniquely identifies record, and is used to enforce integrity - entity integrity, and, in tandem with FOREIGN key, referential integrity. Index is a data structure to facilitate records search. It might be created on PRIMARY key (best practice), unique key or any other field or combination thereof in the table. The limit on how many indices a table might have is defined in RDBMS implementation. An index might - or might not - speed up some queries.”

A: “The primary key is inherently indexed and unique and is the cross reference to related tables. Often the best primary key is an auto number integer as any value entered by humans is subject to error or delay that can be challenging to manage in the user interface ... whereas an auto number is assigned immediately upfront and eliminates any possible record conflict in tables during multi user entries. A unique key is somewhat of an informal definition. My view is that it is a definition of a field that is not being used as the primary key, but is unique unlike i.e. Last Name -- for instance a social security number field. So it is not the primary key as it is not the field/value being used to cross reference to related tables but it is unique in the table.”

A: “A primary key is a unique, non null value which can identify every tuple (row in the table) uniquely. A unique key/column/constraint ensures that no two rows contain the same value (almost the same as primary key). Unless specified explicitly for the column configuration, a NULL is a valid value for column with unique constraint. A index can be thought of as the appendix at the end of the book. The information is sorted in specific order so that look up is easy and it points to the location that is being searched for.”
--Quora.com

------------------------------------------------------------------------------------------------------------------

SUPPORT THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.

LATEST POSTS

03/18 SMS: Entities and Records

03/06 TYFK: Relationships -- Uniqueness and Attribute Constraints

LATEST PUBLICATIONS (order from PAPERS and BOOKS pages)
- 08/19 Logical Symmetric Access, Data Sub-language, Kinds of Relations, Database Redundancy and Consistency, paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper #1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide Mauri).

USING THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or acronyms of the terms listed on the
FUNDAMENTALS page. For detailed instructions on how to understand and use the labels in conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, including earlier posts rewritten in 2017 were relabeled accordingly. As other older posts are rewritten, they will also be relabeled. For all other older posts use Blogger search.
- The links to my AllAnalytics columns no longer work. I re-published only the 2017 columns @dbdebunk, and within them links to sources external to AllAnalytics may or may not work.

SOCIAL MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on Twitter: will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong with This Picture? posts, and my exchanges on LinkedIn.
------------------------------------------------------------------------------------------------------------------

 Fundamentals

Instead of repeating here what we have have already explained extensively, check them out here:

The Key to Relational Keys: A New Understanding -- Primary Keys

A New Understanding of Keys series

Relationships -- Uniqueness and Attribute Constraints

Don't Conflate/Confuse Primary Keys, PK Constraints, and Indexes

No Such Thing As Primary Key Tuning

Setting Matters Straight

  • "Unique key" is redundant.
  • A key is an attribute, not a field.
  • Be it primary or not, a key identifies a logical tuple, not a physical record (unlike many SQL DBMSs, in a true RDBMS there isn't necessarily a 1:1 correspondence between the two).
  • Entity is a conceptual term, integrity is a logical term.
  • A key (uniqueness) constraint enforces multi-tuple integrity (why?).
  • Keys are components of the logical model, indexes are part of the physical implementation.
  • PKs are sually indexed for performance (this defies physical independence -- why?).
  • PKs are a formal theoretical mandate, PK selection -- when there are multiple candidate keys -- is made on pragmatic grounds.
  • A PK represents an assigned name (which identifies entities), not a property (which describes entities), but not necessarily a DBMS-generated surrogate keys (i.e., DBMS-generated "auto number integer").
  • A uniqueness constraint represents a within-group relationships among all entities that are members of the group (which is a collective property of the group) and, therefore, constrains the relation (not table!) that represents the group. Thus, uniqueness is defined with respect to the group (which is why last name is not a suitable PK).

  • A PK is an attribute, not a value, that identifies each tuple in a relation.
  • NULL is a SQL non-relational feature that is prohibited in relational systems in general, let alone in a PK attribute.

Conclusion

There is much more to relational keys than the conventional wisdom in the industry, or what SQL DBMSs support, with which they are inconsistent, or of which practitioners are unaware.




No comments:

Post a Comment

View My Stats