Tuesday, December 24, 2013

Friday, December 20, 2013

Anatomy of a Data Management Project



I've finally found a concrete case to share that demonstrates most of the costly consequences of engaging in database practice without a good grasp of data fundamentals. A web application developer authored the article describing this case. The developer is competent enough to give an excellent post-facto description of the project that enables assessment but, as is usually the case, fails to associate the problems with poor foundation knowledge. That's where I come in.

Saturday, December 14, 2013

Site Update




1. Quote of the Week
If SQL is based on relational algebra which is based on set theory where the concept of null set (empty set) is an axiom of the theory. In this theory empty set is not the same thing as nothing. A point that confuses many people.

Relational algebra is based on 3VL predicates, that is, the answer to any predicate can have three states true, false or unknown. Unknown is caused by the use of a operator on an the absence of a value (null). Within relational algebra null is not to be treated as a value but merely a marker of unknown (absence of a value).

None of this is rocket science and I suggest doesn't result in bad implications. I suggest the so called "bad implications" are only introduced as people use null as a patch for problems for example the division by zero. indeterminate state, open ended ranges, data states to name a few. That is, the issue is not the concept of null but its abuse as a patch for other issues. 
--LinkedIn.com

2. To Laugh or Cry?
Why You Should Never Use MongoDB

3. Online debunking
RELATIONAL DATABASE

4. Elsewhere
Next gen NoSQL: The demise of eventual consistency

5. And now for something completely different
23andMe Is Terrifying, But Not for the Reasons the FDA Thinks





Saturday, December 7, 2013

The "Tyranny" of Knowledge and Reason



WS writes:
I thought Happiness is a glass half empty might be suitable for the "and now for something completely different" part of the site.

Sadly the museum of failed products doesn't contain any software products, but I suspect that if it did, it would contain a lot of products that were going to definitely replace the relational model. I also suspect that many of these products would be reiterations of previously failed products marketed under a different name.

I think there are also some interesting points about how some people believe that their talents are innate and others believe that anything can be learned, given enough time and effort. I wonder if there is some connection between overestimating one's abilities and believing one has an innate talent for some discipline. From my own personal experience I would say that I have often overestimated my abilities when I believed I had a gift for some subject. If I have struggled hard to master something then I generally have a better idea of how much I still have to learn.

I have also found that having a better knowledge of the theory of data management has helped me to see clearly at the start when something is doomed to failure.
If 'replacement' is used in the theoretical sense, I've always claimed that an alternative to predicate logic and set theory as a formal basis for database management is a rather tall order, to put it politely. What is more, if DBMS designers and users don't even know what a data model is, or that database management is impossible without one, what is the chance that some such alternative will emerge?

If used in the implementation sense, we can't really talk of replacement of what was never truly and fully implemented and adopted--SQL is hardly it. As to relabeled reiterations, those who don't know the past...

In general there is an instinct to believe better of oneself than is justified. An important objective of intellectual development, of which the scientific method and theoretical foundations are core elements, is to bound that tendency and bring self-assessment closer to reality. This explains the common disdain of many data professionals, who are tool-trained but not educated, for what I refer to as the tyranny of knowledge and reason that "robs" them of their "freedom" to do whatever they happen to think is best to achieve the purpose.

The article refers to failed consumer products that "nobody wanted to buy". There is an important difference between most of those and the technological foundations on which they are based: consumer products you either like immediately upon use or you don't--it's a matter of sheer opinion, preference, or taste. Applying the same attitude to theoretical foundations can lead to serious trouble that materializes only in the long term. It is training without education that tends to induce the notion that innate talent and practice/experience alone are sufficient for competence.




Sunday, December 1, 2013

Site Update




1. Quote of the Week (h/t Matt Rogish)
"For quite a few years now, the received wisdom has been that social data is not relational, and that if you store it in a relational database, you’re doing it wrong."
Not that I mind seeing brand-new-fangled, hark-to-the-past obsolete dross like MongoDB and its ilk being recognised for the compost it is...--sarahmei.com
2. To Laugh or Cry?
A Revolutionary Paradigm: The Failure of Relational Database, The Rise of Object Technology and the Need for the Hybrid Database
Brothers, can you spare me the paradigms?


3. Online
What is this Data Integration Innovation

4. No SQL!
NYT Healthcare.gov Project Chaos Due Partly To Unorthodox Database Choice
Administration announces another HealthCare.gov delay

5. And now for something completely different (well, not completely!)


Thursday, November 21, 2013

Structuring the World With 'Unstructured Data'



Database management depends on structure -- of reality and of data representing it in databases -- which determines the data manipulation and integrity enforcement by database management systems (DBMS).

Having argued this for decades I have been, predictably, quite skeptical of the hype of systems that manage and extract information from so-called "unstructured data," purportedly obviating the need for business modeling and database design. It's also why I've been skeptical of the criticism of SQL-based DBMSs as inflexible because they force big data, much of it text, into tabular schemas into which it doesn't fit or that are difficult to envision upfront.

Saturday, November 9, 2013

Site Update




1. Quote of the Week.
My best advice in all architecture, and platform choices, like RDBMS vs NoSql. The number 1 question, every single assumption you have about the system, the "this is important because X Y X and this that etc etc. Every single "has to be" you have there, expect every single part of it to change. How would you build your system if every "key fact" was expected to change. These key facts, that are supposed to be pillars are actually volatility themselves and need accounted for, not accepted. --LinkedIn

2. To Laugh or Cry?
How can we add employees, dept and location tables in oracle 10g?

3. Online exchanges I participated in

I am referring you back to an item I posted in the last update:
What is a Data Model And which “Data Model” do you prefer?
for two reasons: comments were added since then that should be read, some of which belong in the "To Laugh or Cry" category.


4. What do these two items tell you?
Next gen NoSQL: The demise of eventual consistency
Currently search engines are thought of as tools to find text but Ashok Chandra, Microsoft distinguished scientist and general manager of the Interaction and Intent Group at Microsoft Research Silicon Valley, believes people soon will think of search engines as “task engines.”
“Search technology began with words,” says Chandra.  “We built a whole search infrastructure around words. But in this new era of search, we are working with entities, because people think in terms of them, such as a hotel, a movie, an event, a hiking trail, or a person. The Leibniz platform is designed from the ground up to deal in entities, with the goal of making it easier for people to accomplish the tasks they set out to do.”
--A Look Microsoft’s ‘Leibniz’ Platform
BTW, I love "Interaction and Intent Group". Wonderful.


5. And now for something completely different.
 I Challenged Hackers to Investigate Me



Saturday, October 26, 2013

Site Update




1. Quote of the Week
Can any one guide me, how to search specific value in all database table? in output we required tables and columns name.
--LinkedIn.com

2. To Laugh or Cry?
Go On, Live a Little. Denormalize Your Data

3. Online

My October post @All Analytics
Big Data Uber Alles
A follow-up exchange to previous exchanges on my post E/RM Is Not a Data Model.
What is a Data Model And which “Data Model” do you prefer
My own follow-up is forthcoming here.


4. I've done some house cleaning and found some links that I accumulated at various points in time that I deemed worth reading, but which fell victims to lack of time. They may be interesting or useful to others, so I list them here.

On SQL:
On relational technology:
General:

5. And now for something completely different: the wastebasket and the barber.




Saturday, October 12, 2013

Site Update




1. Quote of the Week
... MongoDB, or RavenDB ... are excellent for non-relational, loose schema databases. Databases always have schemas, the data is the schema, inherently in dbs like Raven & Mongo.
--LinkedIn.com

2. To Laugh or Cry?
Easy Steps to a Complete Understanding of SQL

3.  Three online exchanges on my E/RM post in which I participated.
Entity-Relationship Model Not a Data Model
Entity-Relationship Model Not a Data Model
Entity-Relationship Model Not a Data Model
The last one is an excellent validation that many data professionals erroneously believe they know and understand the RM--which prevents them from appreciating its value and doing something about it. It also demonstrates that schooling is not education.

I may tackle some of the comments in a future post.


4.  How much would you bet against my suspicion that there is little/superficial, if any, relational background to this?
SQL Database for Beginners
  
6.  And now for something completely different

From an article on Marissa Mayer:
They say her obsession with the user experience masks a disdain for the money-making side of the technology industry. There is some truth to what they say ... Mayer joined Google as a programmer and rose to become the executive in charge of the way Google search and many other popular Google products looked to web users ... She obsessed over pixels; their hue, shade, and placement. She co-authored a handful of patents, including an important one for Google: "Graphical user interface for a universal search engine." By 2005, Mayer moved into management, overseeing the look and feel of Google's most important products ... But being in charge of how Google products should look, Mayer's job was, basically, to relate with Google's millions of users. How would she do that? ... The first is that she would recreate the technological circumstances of her users in her own life. ...  Mayer's second method was to lean on data. She would track, survey, and measure every user interaction with Google products, and then use that data to design and re-design.
Wow, and after all this the user interface of all Google's online services sucks, they are buggy and there is practically no user support? Imagine what will happen without Marissa Mayer there!

And, oh, the cultural sophistication of the technology elite. Helps understand their output.
2013 Tour de Coop Chickens, Beehives & Homesteads Silicon Valley Funcheap



Sunday, September 29, 2013

Testing Your Foundation Knowledge




Expertise in a field and the ability to convey it to others are distinct, and the latter requires different motivation, skills and talent. Top technical experts are more often than not poor communicators, whether verbally or in writing, for some inherent reasons, Codd being an excellent example. That's one of the core reasons for poor foundation knowledge in data management in general, and for poor appreciation of the relational model in particular.

In a previous post I started a little experiment: I asked both readers who think they know and understand the relational model (RM) and those who do not but want to, to comment on whether a theoretically correct explanation of data fundamentals offered by reader PK was helpful and, if not, why not. I promised to draw some conclusions regarding the difficulty of dispelling misconceptions without losing either theoretical rigor, or the audience--a non-trivial task for an educator in an industry that deems theory impractical.

I can't say the response exactly answered my question (I recommend reading the comments, though). But let me, as promised, try my hand at making better sense of both the explanation and the comments (for an in-depth treatment see paper #1, Business Modeling for Database Design). Let me know if it helps.

Sunday, September 22, 2013

Site Update




1. Schedule reminder
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
I am constructing a new website ... using node.js. Its aim is to have many subscriber (people who offer help and people who need help) it should be scalable in different language. I have to decide wich is the more suitable db. I am thinking about to have two db (mongodb and postgress) for site languages and people account, people should vote other people ability. As db experts could you give me some suggestions? What would think could be a good db choice?
--LinkedIn.com

3. To Laugh or Cry?
Can anyone guide about using DB2

4. Online

5. There were several posts on this site about the Meijer article, the letter to the editor supporting it, and the reactions by David McGoveran and C. J. Date to both. But I missed the one by my fellow relationlander Erwin Smout: A letter by Carl Hewitt. At one point he writes:
At any rate, I'm still left wondering what mr. Hewitt's problem is here.
I don't know why he wonders -- it is pretty obvious to me.


6. The frequency of fads has been increasing and the time between them decreasing. Today pushing a "new thing" starts before the last fad is exhausted: The Next Wave of Data Management


7. And now for something completely different: How the US Crushed Youth Resistance




Sunday, September 8, 2013

Site Update





1. Schedule reminder
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
Q: One of the main resistences of RDBMS users to pass to a NoSQL product are related to the complexity of the model: Ok, NoSQL products are super for BigData and BigScale but what about the model?

A: Actually graphs are the way we (people) think and organization data in our head, as computer people it is on[e] of the most popular way[s] we are taught to think about data, so this should be natural.
--slideshare.net

3. To Laugh or Cry?
"Splunk for Big Data"

4. My comment at Robert Young's blog
No Mas!! No Mas!!

5.
Something I argued long before they did.
Think Big Data Is All Hype? You're Not Alone

6. And now for something completely different.
High-tech toilets vulnerable to hackers
No comment.



Sunday, August 25, 2013

Site Update




1. Schedule update
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
How many software programs are mathematically provable. And yet everybody still writes software and for the most part it works. Relational theory and SQL was very important for establishing a standard across vendors to a point. And yet switching relational database vendors is still very expensive proposition because the standards don't address the features that users need and use everyday that are not part of the standard. At the heart of the system the relational model can still be enforced. But a product lives and dies not on whether it is mathematically provable but it's features set, efficiency and cost to develop in.
--LinkedIn.com

3. To Laugh or Cry?
Please help with my data model design
If this was student homework, it is an excellent example of how database management should not be learned and a validation of the substitution of the "cookbook approach" for education. Ironically it's in the forum's section "Relational theory". Had theory been taught, such questions would not have been asked.


4. Two online exchanges I participated in
Predictable--it was just a matter of time. My latest post at All Analytics is quite apropos: Real Data Science: General Theories of Data.
In this context, consider In Silicon Valley, age can be a curse.


5. And now for something completely different

Not entirely unrelated:
Facebook boosts connections, not happiness study
The Curse of Self-Service (h/t Davide Mauri)




Sunday, August 11, 2013

Site Update




A while ago my friend Stephen Henley published his opinion on Missing Data, which questioned the thoughts--not well formed and definitive at the time--of C. J. Date, Hugh Darwen and myself on the subject. Since then Date has proposed a default values scheme which he has subsequently renounced; Darwen has published How To Handle Missing Information Without Using NULL and I proposed a relational solution in the recently revised paper #3, The Last NULL in the Coffin.

In this context, I dedicate this update (except the last item) to NULL. Whatever differences may exist among the above-mentioned relational proponents, we do agree that NULL is certainly not a solution to the problems of missing data.

Time permitting, I may post some belated comments on Henley's piece.


1. QUOTE OF THE WEEK
If SQL is based on relational algebra which is based on set theory where the concept of null set (empty set) is an axiom of the theory. In this theory empty set is not the same thing as nothing. A point that confuses many people.

Relational algebra is based on 3VL predicates, that is, the answer to any predicate can have three states true, false or unknown. Unknown is caused by the use of a operator on an the absence of a value (null). Within relational algebra null is not to be treated as a value but merely a marker of unknown (absence of a value).

None of this is rocket science and I suggest doesn't result in bad implications. I suggest the so called "bad implications" are only introduced as people use null as a patch for problems for example the division by zero. indeterminate state, open ended ranges, data states to name a few. That is, the issue is not the concept of null but its abuse as a patch for other issues. 
--LinkedIn.com

2. TO LAUGH OR CRY?

Why shouldn't we allow NULLs?, stackexchange.com


3. An ONLINE exchange I participated in.

NULL Handling in Databases, LinkedIn.com


4. And now for something completely different.

An astonishing act of statistical chutzpah
Why Great Teachers Are Fleeing the Profession
The ABCs of MOOCs

What does this say about the educational system?




Tuesday, July 30, 2013

The Final NULL in the Coffin: A Relational Solution to Missing Data




Order via the PAPERS page


NEW! THE FINAL NULL IN THE COFFIN: A RELATIONAL SOLUTION TO MISSING DATA NEW!

v.3 (August 2013)

The relational data model is based on the two-valued logic (2VL) of the real world: every proposition about the real world is unequivocally true or false. But our knowledge of the real world is usually imperfect—some data is missing—which means that we don't always know whether propositions are true or not; 2VL no longer applies and data integrity and database query results are no longer guaranteed to be enforceable and provably logically correct with respect to the real world.

Missing data has possibly been the thorniest aspect of database management: without a logically sound yet practical solution, data professionals and users are left between a rock and a hard place. They must either (a) rely on SQL's arbitrary and flawed implementations of three-valued logic (3VL) based on NULLs and risk results that are easy to misinterpret, or erroneous in ways hard to discern, or (b) undertake in applications a prohibitively complex, error prone and unreliable burden that belongs in the DBMS.
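
The kind of easy-to-misinterpret result referred to above can be shown with a minimal sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for a SQL DBMS; the tables employee and closed_dept, their columns and their contents are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical tables for illustration only; one department value is missing (NULL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT NOT NULL, dept TEXT);
    INSERT INTO employee VALUES ('Ann', 'Sales'), ('Bob', 'IT'), ('Cy', NULL);
    CREATE TABLE closed_dept (dept TEXT);
    INSERT INTO closed_dept VALUES ('IT'), (NULL);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Under 2VL "dept = 'Sales' OR dept <> 'Sales'" holds for every employee.
# Under SQL's 3VL the condition evaluates to unknown for Cy, who is silently dropped.
in_or_out = names(
    "SELECT name FROM employee "
    "WHERE dept = 'Sales' OR dept <> 'Sales' ORDER BY name"
)

# A single NULL in the subquery result makes NOT IN unknown or false for
# every row, so the query returns nothing at all.
not_closed = names(
    "SELECT name FROM employee "
    "WHERE dept NOT IN (SELECT dept FROM closed_dept) ORDER BY name"
)

print(in_or_out)   # ['Ann', 'Bob']
print(not_closed)  # []
```

Neither query raises an error or a warning; nothing in the results tells the user that missing data affected them.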

This paper illustrates some of the drawbacks of the many-valued logic (nVL, n > 2) approach to missing data and SQL’s NULL scheme and proposes a solution within the 2VL/relational framework that:
  • Guarantees data integrity and logically correct query results;
  • Avoids the complications and problems of nVL/NULLs;
  • Requires no changes to the relational model;
  • Is largely transparent to users;
  • Keeps users better apprised of the existence and effects of missing data.
The proposed solution requires research into its implications for data manipulation and integrity enforcement before it is implemented, but we believe it is theoretically sound and implementable in a truly relational DBMS (TRDBMS) using technologies that, unlike SQL, support full physical data independence, e.g., the TransRelational™ Model (TRM).


Table of Contents
  • Introduction
  • "Inapplicable Data": Nothing's Missing
  • Missing Data: Into the Unknown
  • SQL’s 3VL: NULL
  • Known Unknowns: Metadata
  • A 2VL Relational Solution
  • The Practicality of Theory
  • 2VL vs. NULL in the Real World
  • Relation Proliferation
  • The TransRelational™ Model
  • Conclusion
  • Some Misconceptions Debunked
  • References




Sunday, July 28, 2013

Site Update




1.
Some housekeeping. The posting to the blog and multiple static pages is a bit of a hassle. I am also facing some work on my seminars and papers. Until further notice:
  • There will be one post/week--alternating articles and Site Updates (I may skip the latter on certain weeks, if absolutely necessary);
  • Quotes and links to LAUGH/CRY? and FP ONLINE will be posted directly into Site Update posts (like below); the respective static Pages will be updated at the end of each month.
Some tool that would automate posts and updates in one shot would have helped. I looked into it, but for various reasons (including Google's Blogger updates), nothing is available (if you know of any, preferably from experience, please recommend).

2.
Quote of the Week:
...the relational model has no relationships since Codd decreed that all relationships must be represented by foreign keys, which are exactly the same as "attributes" ... Consider if we had a bunch of tables, each containing the thing A. Now what is the population of A? It cannot be found in any one of the tables. It is actually the union of all the populations of A plus more if we allow A to exist (i.e., be of interest to us) but does not appear in any of the tables. That would be the case of a master reference list of "codes" for which we would then build a separate table. But even that is insufficient. We would also have to define and enforce referential integrity everywhere an A appeared. All of this is handled explicitly and correctly in ORM -- we model objects (each one appears only once in a data model diagram) and relationships. There are no attributes. As I said before, an attribute is an object playing a role in a relationship with another object.
--LinkedIn.com
3.
To Laugh or Cry?
What’s the Best Way for Structured Data Computing in Java?
4.
FP Online:
Let's innovate....database
5.
Good advice:
Designing a Database: 7 Things You don't Want To Do
But why does it bother me?

6.
And now for something completely different.
NSA claims inability to search agency's own emails
Clueless doctor sleeps through math class, reinvents calculus…and names it after herself. At least the doctor re-invented something in a different field. Data professionals do it all the time in their own field.
You can't make these things up.



Monday, July 15, 2013

Site Update (UPDATED)




07/19/13: I have also added my latest post at All Analytics to the FP ONLINE page.

07/18/13: This update referred to items that were erroneously dated 7/3/13 instead of 7/15/13. This has now been corrected. 



1.
The 'Quote of the Week' was posted to the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Everything should be as simple as possible, but not simpler.
--Albert Einstein

3.
A link to an exchange I participated in was posted on the FP ONLINE page.

4.
And now for something completely different.

If You Search, Advertise on, Invest in, or Have Kids Who Use Google, You Must See

Too much power is always dangerous, no matter who holds it.




Monday, July 8, 2013

Relational Theory and Database Practice




I shared the links to my recent three-part series on foreign keys (and integrity constraints in general) on LinkedIn. Comments on the second installment raised an important issue about keys (discussed in more depth in Business Modeling for Database Design), which deserves attention.
NK: Let me first affirm my position that I believe foreign keys are the fundamental bases on which relational database managements system operate. Foreign keys provide the relationship in database normalization. Foreign keys are like the framework of a building structure. While some developers may have the notion that constraints and integrity checks can be handled better at the application layer, I would want to refer them to tools like ER Studio, ERWIN, and Visual Studio ... A good database design starts at the logical design level. Abstracting constraints and integrity checks from this layer to the application layer can lead to corrupt database designs. A simple case in point; How would you enforce a unique constraint on a table with 10 million rows? Will it make better sense to have a unique index on the table\field or have the application layer enforce the constraint?
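
NK's closing question can be made concrete with a minimal sketch of a uniqueness constraint declared to, and enforced by, the DBMS. SQLite via Python's standard sqlite3 module is used purely for illustration; the account table, its email column and the add_account helper are hypothetical.

```python
import sqlite3

# Hypothetical table for illustration; uniqueness is declared to the DBMS,
# which enforces it for every application that ever writes to the table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)

def add_account(email):
    """Attempt an insert; return True if the DBMS accepted the row."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO account (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The DBMS rejected the duplicate; no application-side scan was needed.
        return False

first = add_account("a@example.com")
duplicate = add_account("a@example.com")
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]

print(first, duplicate, row_count)  # True False 1
```

An application-enforced equivalent would have to query for an existing row before each insert, repeat that logic in every program that touches the table, and still guard against two concurrent inserts passing the check at the same time.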

Saturday, June 29, 2013

Site Update




1.
On the SCHEDULE:

A private database design seminar, October 9-10, Milan, Italy (sponsored by Microsoft and SolidQ)

A public presentation to the SQL Server User Group Italy (UGISS), October 8, Milan, Italy.

Details forthcoming. Contact Davide Mauri @SolidQ.


2.
The 'Quote of the Week' is an online question that is too long to post to the QUOTES page, so I posted the link to the exchange it initiated.


3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Remember my claim that we are regressing to this?


4.
A link to an exchange I participated in was posted on the FP ONLINE page.


5.
Life and Work of Ted Codd (YouTube)

Is everything accurate?


6.
And now for something completely different.

How the Hum of a Coffee Shop Can Boost Creativity

The logical conclusion and real risk of digitizing everything in sight.


Tuesday, June 25, 2013

Data Model: Neither Business, Nor Logical, Nor Physical Model




Note: For a more in-depth discussion see Business Modeling for Database Design.

Chris Date once wrote a paper titled Models, Models Everywhere, Nor Any Time to Think, deploring the confused and distorted way in which fundamental concepts and terminology in general, and relational ones in particular, are used in the industry. But no matter how many times a misconception is debunked, the abuse continues and will do so given educational failure and disregard for precision. Data model is a case in point (see Unmuddling Modeling, Parts 1 and 2, and What Is a Data Model?).

Thursday, June 20, 2013

Site Update




1.
The following were added to the SCHEDULE:
  • Private database design seminar, October 9-10, Milan, Italy (sponsored by Microsoft and organized by SolidQ)
  • Public presentation to the SQL Server User Group Italy (UGISS), October 8, Milan, Italy, organized by SolidQ.
2.
The 'Quote of the Week' was posted on the QUOTES page.

3.
My latest All Analytics column was posted on the FP ONLINE page.

Two of the previously posted exchanges have new comments:
Different Types of DBMS
Comments on my Foreign Keys, Part 2 The Costs of Application-Enforced Integrity

4.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

5.
A link to an exchange I participated in was posted on the FP ONLINE page.

6.
And now for something completely different
European data protection watchdogs are closing in on Google, with Spain charging the software giant with six legal infringements punishable by up to €1.5m (£1.3m) in fines, while France has given it three months to rewrite its privacy policy.
...
On the same day, France's Commission Nationale de l'Informatique et des Libertés (CNIL) gave Google formal notice that it risks a fine of up to €150,000 and a second of €300,000 if it fails to rewrite its privacy policy within three months.
--Google and privacy: European data regulators round on search giant
I'm sure this will stop the abuses cold.



Friday, June 14, 2013

Site Update




1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated in was posted on the FP ONLINE page.


4.
And now for something completely different.

The United States Is Still in an Extraordinarily Good Position 

Certainly true from his perspective.

Banks Reap Profits on Overdraft Fees as Customers Lose Money 

See what I mean?

The World Isn't Fair

He should know. And it works because he

Augmented Reality vs. Decimated Reality

helps ensure we stay like this:

The laughable innocence of Facebook and Google (and us)

Incidentally, believing in corporate innocence is what public innocence is all about.





Thursday, June 6, 2013

Site Update




1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated in was posted on the FP ONLINE page.
Great idea, but I am skeptical about its success--it goes against the societal grain. Societies are interested in conformism, not critical thinking.

4.
If this and many other such improvements are possible, what is the justification for still focusing on "denormalization for performance"?

In the same vein, why is Michael Stonebraker referring to "legacy relational DBMSs", while demonstrating that the performance limitations and solutions of current SQL systems have actually absolutely nothing to do with their being relational (which, in fact, they are not)? Indeed, everything is about implementation--how could it be otherwise? And he is one of the people who does know the fundamentals! Ah, yes, he is a vendor now.
  
To his credit, he rejects NoSQL for the right reasons and his solutions to today's performance needs are sensible. But why does he want to preserve SQL, rather than come up with a TRDBMS? Don't all those solutions validate our claim, made for decades, that such a system can be an excellent performer? Why, as an implementer, did he not design one?

Note: I happen to know what the solution is for the performance factors for which he does not have any, but unfortunately cannot say anything about it (it's deja vu TransRelational Model(TM) all over again).

5.
And now for something completely different.

Twitter's identity crisis

Zynga to lay off 18 percent of staff, shut offices, slash infrastructure

Facebook loses advertisers again

See a pattern? No? Does the following help?

Yahoo Shuts Down Mail Classic, Forces Switch To New Version That Scans Your Emails To Target Ads

How about this:

America, It's Time to Start Making Things Again

If you still don't, I've hinted about it in my last All Analytics post (see FP ONLINE page). It was predictable.




Thursday, May 30, 2013

Site Update




1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated in was posted on the FP ONLINE page.

4.
And now for something completely different.

When I was told that a psychology professor at a prestigious university went to a Deepak Chopra retreat I was shocked that I am no longer shocked about such things. And then this:

Class of 2013 The Future of Leadership

Looks like leadership does not have much future, but I am certainly glad I am old enough not to experience it.



Friday, May 24, 2013

Site Update



1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
A link to an exchange I participated in was posted on the FP ONLINE page.

4. From "Top 12 reasons why you should not attend the NoCOUG conference tomorrow":
#10 They talked up SQL for 25 years but now, they’re all, like, “No SQL.” I mean, really!
It's difficult to joke in the database field and not touch something serious.

5.
Not unrelated: This question did not get any answers. Doesn't anybody read anymore? On the other hand, given what's published these days, I wonder if that's as bad an idea as it used to be.


Thursday, May 16, 2013

Site Update



1.
My keynote address at the Northern California Oracle User Group Spring 2013 conference added to the SCHEDULE.

BTW: If you live in San Francisco, attend the conference on 5/22 and can give me rides to and/or from Pleasanton, or know somebody who can, it will be greatly appreciated. Please email me at the address on the ABOUT page.

2.
The 'Quote of the Week' was posted on the QUOTES page.

3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

4.
A link to my latest All Analytics column was posted on the FP ONLINE page.



Friday, May 10, 2013

Site Update



1.
My keynote address at the Northern California Oracle User Group Spring 2013 Conference is on the SCHEDULE page.

BTW: If you live in San Francisco, attend the conference on 5/22 and can give me rides to and/or from Pleasanton, or know somebody who can, it will be greatly appreciated. Please email me at the address on the About page.

2.
The 'Quote of the Week' was posted on the QUOTES page.

3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

4.
A link to an online exchange I participated in was posted to the FP ONLINE page.

I will probably address some of the issues on my All Analytics blog. Stay tuned.

5.
San Jose State Philosophy Dept. Criticizes Online Courses

Didn't I tell you so?

6.
Google Aims To Patent Policy Violation Checker, Potentially Revolutionizing Email Snooping

Any organization that grows beyond a certain size and gains a certain level of market dominance -- what is called 'institutional power' -- is no different from an oppressive government. One of the indicators of reaching that level is the creation of a lobbying arm and a gradually increasing focus on it, as well as disregard for the public.

There was IBM, then Microsoft, now it's Google and Facebook. Apart from the arrogance due to corruptive power, which is common to all dominant corporations, a significant difference between the former two and the latter two is the nature of their business models. Exclusive reliance on advertising, whose profitability inherently decreases with time, pushes the latter into ever more evil behavior in order to sustain growth and profitability.

7.
A Google search that hit my site:

"which is better, a highly normalized database or a database structure that makes end user data acces".


Sunday, May 5, 2013

Theory: As Far From Religion As One Can Get



In So What is a 'Large Database' JS states:
The points you make here, and consistently ... center pretty clearly on distinction between logical models and physical implementations. Products that sacrifice the logical model for various practical considerations (speed, size, cost, etc. - at least in the short term), reinforce the general lack of focus on, or understanding of, the relational model, as well as diminishing appreciation of the distinction between logical and physical.
Physical data independence (PDI) is, indeed, a core advantage of the relational model, but hardly the only one I have focused on over the years. And the relational model is hardly the only component of the foundation knowledge that is increasingly lacking in the industry.

Thursday, May 2, 2013

Site Update



1.
Details of my keynote address at the Northern California Oracle User Group Spring 2013 conference are on the SCHEDULE page.

BTW: If you live in San Francisco, attend the conference on 5/22 and can give me rides to and/or from Pleasanton, or know somebody who can, it will be greatly appreciated. Please email me at the address on the ABOUT page.
2.
The 'Quotes of the Week' were posted on the QUOTES page.

3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Carl Hewitt's "response" to Date and McGoveran's letter to the editor criticizing his previous nonsense. Incidentally, somebody Googled "chris date mcgovern [sic] carl hewitt" and here's the blurb that comes up:
Carl Hewitt - Wikipedia, the free encyclopedia
Carl Hewitt is Board Chair of the International Society for Inconsistency ... which was developed in the early 1970s by Sussman, Hewitt, Chris Reeve, and David ...
4.
A link to an online exchange I participated in was posted to the FP ONLINE page.

5.
The first installment of my Debunking Corner for the Northern California Oracle User Group Journal Spring 2013 issue has been published. A link to the journal PDF was posted on the FP ONLINE page (scroll down).

6.
The Costly Illusion: Normalization, Integrity and Performance paper has been revised to correct an error (see Understanding Further Normalization: 2NF).

7.
A Bing search that hit my site: "optimal database for complex xml schemas nosql". I don't think that's what the author had in mind.

8.
From a LinkedIn Profile:
Mary Hart
B2B Tech Marketing Copywriter/Professional Liar, Greater Boston Area
I would appreciate the honesty but for the logical paradox.


Sunday, April 28, 2013

Tables, Full Normalization and Business Rules



REVISED: 10/16/16
 

Often somebody produces a table and asks if it is fully normalized (in 5NF) and, if not, in what normal form it is. This is an indication of a poor grasp of data fundamentals.

Consider ASSIGNMENTS:
 EMP# ENAME    PROJECT       DEPT#
===================================
 100  Spenser  Sys Support   E21
 100  Spenser  Comp Svcs     E21
 100  Spenser  Supp Svcs     E21
 160  Pianka   Info Center   D11
 310  Setright Documentation D11
 310  Setright Mfg Systems   D11
 150  Adamson  Info Center   D11
-----------------------------------
First, a normal form is a property of a relation, not a table. An R-table is only a "visual shorthand" for a relation -- a special kind of table that visualizes a relation on some physical medium (e.g., paper) -- and the two should not be confused.

Second, the normal form of a relation is determined from attribute dependencies. Formally, a relation is fully normalized (in 5NF) if and only if the only dependencies that hold in it are functional dependencies (FD) of the non-key attributes on the key (i.e., there is exactly one value of each non-key attribute for every key value, but not vice-versa). Since a key represents an entity identifier, this condition exists only when, informally, a relation represents entities of a single type (why?). Is this true for the relation pictured by ASSIGNMENTS?

The fact is that whether a relation represents a single type of entity -- and, therefore, is fully normalized -- cannot be ascertained from sheer visual inspection of the table picturing it. It requires knowledge of what the underlying relation means, namely the type(s) of entity specified by the business rules in the corresponding conceptual model that the relation represents.

For example, if the rules:

  • R1: Every employee is identified by an employee number.
  • R2: Every employee has an employee name.
  • R3: Every employee works in a department.
  • R4: Every project assignment is identified by an employee number and a project name.
model two types of entity:
  • Employees: {emp. number} --> {employee name, department number}
  • Project assignments: {employee number, project}
then the relation represents both and, consequently, is not in 5NF. In fact, without a well defined and complete conceptual specification of entity types, you can't even tell whether relations have keys (if they do not, they are not relations) and, if they do, what the key is.
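To make the point concrete, here is a minimal Python sketch that tests whether a functional dependency is violated by the sample ASSIGNMENTS tuples. The dictionary encoding and the helper name `fd_holds` are assumptions made for illustration only. Note what the check can and cannot do: sample data can refute a dependency, but it can never establish one, because dependencies come from the business rules.

```python
# Hypothetical encoding of the ASSIGNMENTS tuples pictured above.
ASSIGNMENTS = [
    {"EMP#": 100, "ENAME": "Spenser",  "PROJECT": "Sys Support",   "DEPT#": "E21"},
    {"EMP#": 100, "ENAME": "Spenser",  "PROJECT": "Comp Svcs",     "DEPT#": "E21"},
    {"EMP#": 100, "ENAME": "Spenser",  "PROJECT": "Supp Svcs",     "DEPT#": "E21"},
    {"EMP#": 160, "ENAME": "Pianka",   "PROJECT": "Info Center",   "DEPT#": "D11"},
    {"EMP#": 310, "ENAME": "Setright", "PROJECT": "Documentation", "DEPT#": "D11"},
    {"EMP#": 310, "ENAME": "Setright", "PROJECT": "Mfg Systems",   "DEPT#": "D11"},
    {"EMP#": 150, "ENAME": "Adamson",  "PROJECT": "Info Center",   "DEPT#": "D11"},
]


def fd_holds(rows, determinant, dependent):
    """Return False if the sample violates determinant -> dependent.

    A True result only means the sample is consistent with the
    dependency; it does not prove that the business rules impose it.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        val = tuple(row[a] for a in dependent)
        # Remember the first dependent value seen for each determinant value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Consistent with rules R1-R3: one name and one department per employee.
emp_determines_name_dept = fd_holds(ASSIGNMENTS, ["EMP#"], ["ENAME", "DEPT#"])

# Refuted by the sample: employee 100 has three projects, so EMP# alone
# cannot be the key of ASSIGNMENTS.
emp_determines_project = fd_holds(ASSIGNMENTS, ["EMP#"], ["PROJECT"])

# Also "holds" in the sample, yet no rule says names are unique.
name_determines_emp = fd_holds(ASSIGNMENTS, ["ENAME"], ["EMP#"])

print(emp_determines_name_dept, emp_determines_project, name_determines_emp)
```

The last check is the instructive one: the sample happens to contain no two employees with the same name, so the dependency appears to hold, but only the conceptual model can say whether it is a rule or an accident of the data.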

Database designs adhering to the Principle of Full Normalization (POFN) do not "bundle" entity types and produce 5NF relations, obviating the need for further normalization. For the advantages of full normalization and the drawbacks of the "denormalization for performance" illusion, see paper #2 and the recently published DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS.

Explicit further normalization is necessary only to "repair" poorly designed non-5NF relations by replacing them with 5NF projections. For example, ASSIGNMENTS can be replaced with EMPLOYEES and PROJ_ASSIGNS, pictured by the R-tables:

 EMP# ENAME    DEPT#
====================
 100  Spenser  E21
 160  Pianka   D11
 310  Setright D11
 150  Adamson  D11
--------------------

 EMP# PROJECT
===================
 100  Sys Support
 100  Comp Svcs
 100  Supp Svcs
 160  Info Center
 310  Documentation
 310  Mfg Systems
 150  Info Center
-------------------
each representing a single entity type. The repair is possible because the following holds:
ASSIGNMENTS{EMP#,ENAME,DEPT#} JOIN ASSIGNMENTS{EMP#,PROJECT} = ASSIGNMENTS
where the left-hand side is a join of two projections of ASSIGNMENTS (i.e., no information is lost).
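The nonloss property can be checked mechanically. The following is a minimal Python sketch, assuming the relation is encoded as a set of tuples with the ASSIGNMENTS data pictured earlier; the helper names `project` and `join_on_emp` are hypothetical and stand in for the relational projection and natural join operators.

```python
# Hypothetical encoding of ASSIGNMENTS as a set of tuples.
# Attribute order: (EMP#, ENAME, PROJECT, DEPT#)
ASSIGNMENTS = {
    (100, "Spenser",  "Sys Support",   "E21"),
    (100, "Spenser",  "Comp Svcs",     "E21"),
    (100, "Spenser",  "Supp Svcs",     "E21"),
    (160, "Pianka",   "Info Center",   "D11"),
    (310, "Setright", "Documentation", "D11"),
    (310, "Setright", "Mfg Systems",   "D11"),
    (150, "Adamson",  "Info Center",   "D11"),
}


def project(relation, positions):
    """Projection: keep the given attribute positions; the set drops duplicates."""
    return {tuple(row[i] for i in positions) for row in relation}


def join_on_emp(employees, proj_assigns):
    """Natural join on EMP#, restoring the original attribute order."""
    return {
        (emp, ename, proj, dept)
        for (emp, ename, dept) in employees
        for (emp2, proj) in proj_assigns
        if emp == emp2
    }


EMPLOYEES = project(ASSIGNMENTS, (0, 1, 3))   # {EMP#, ENAME, DEPT#}
PROJ_ASSIGNS = project(ASSIGNMENTS, (0, 2))   # {EMP#, PROJECT}

rejoined = join_on_emp(EMPLOYEES, PROJ_ASSIGNS)
print(rejoined == ASSIGNMENTS)  # the join of the projections reproduces ASSIGNMENTS
```

The equality holds for this data because EMP# functionally determines ENAME and DEPT#, so the join pairs every project with exactly one employee tuple and neither loses tuples nor produces spurious ones.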




Thursday, April 25, 2013

Site Update



1.
My keynote address at the Northern California Oracle User Group Spring 2013 conference was added to the SCHEDULE.

BTW: If you live in San Francisco, attend the conference on 5/22 and can give me rides to and/or from Pleasanton, or know somebody who can, it will be greatly appreciated. Please email me at the address on the About page.

2.
A link to my latest All Analytics column was posted on the ONLINE page.

Incidentally, with the discovery by business of analytics as some sort of "new data science", overnight born-again BI experts proliferate like frogs after heavy rain. This suggests a similar poverty of foundation knowledge and rich debunking targets.
Please submit any pearls you come across that could be interesting targets from a data perspective.

3.
The 'Quote of the Week' was posted on the QUOTES page.

4.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Many years ago I wrote something about what I called the "kitchen sink" approach to data management, but this one takes the cake. All the following are included:
  • Key-value pair programming language
  • Entity Attribute Value database model
  • Relational Database Management System, specifically Postgres 9
  • Objects and object metadata
  • SQL client interface (returns objects of various types)
  • Procedural SQL [FP: Huh?]
  • Schema of "Sprout data model" [FP: Wonder what that is]
  • Objects (tables, views) are accessed with their resource identifier
  • High level syntax-independent [FP: Wow!!!!]
and much more (check out, in particular, the bulleted list of features).

5.
A link to an online exchange I participated in was posted to the FP ONLINE page.

6.
Consider the topics in Jonathan Lewis' Oracle Mechanisms Webinar in the context of my question: given so many physical/implementation factors that affect performance, why the instinct to attribute poor performance to (logical) normalization? And there are many more than those tackled by Jonathan.

7.
While checking hits to this site, I noticed that one of them was due to the  following Google search: "My data model is a better model of reality than your data model. What would your response be?"

Well?



Thursday, April 18, 2013

Site Update



1.
My keynote address at the Northern California Oracle User Group Spring 2013 conference was added to the SCHEDULE.

2.
A link to my latest All Analytics column was posted on the ONLINE page.

3.
The Quote of the Week was posted on the QUOTES page.

There was a comment to my recent Un-muddling Modeling, Part 1 that the conceptual and logical models do not require the relationship concept. However, this does not mean we cannot refer to relationships that are implicit in the models and that is usually in response to arguments like this one.

4.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Nokia Entertainment: Why we went Mongo

An excellent example of how products are selected in the absence of foundation knowledge.

Related.

Ideas to integration data sets from structured and unstructured data

Bay Area coding boot camps promise to launch tech careers   
SAN FRANCISCO -- Looking for a career change, Ken Shimizu decided he wanted to be a software developer, but he didn't want to go back to college to study computer science.

5.
A link to an online exchange I participated in was posted to the FP ONLINE page.

How to I create a logical data model for Geospatial Data?

6.
Big Data Is Just For Big Companies - And Other BS

There are two related core cycles in IT: centralization/decentralization/re-centralization and corporatization/democratization/re-corporatization.

7.
I have often referred to the difficulty of conveying informally the formal without losing either the rigor, or the audience. David Portas, one of the few knowledgeable practitioners, demonstrates some of that difficulty in his comments to the following post by Hugo Kornelis: NULL - The database's black hole

8.
Enjoy.

Big Data Dilbert


Wednesday, April 10, 2013

Site Update



1.
My keynote address at the Northern California Oracle User Group Spring 2013 conference was added to the SCHEDULE page.

2.
The 'Quote of the Week' was posted on the QUOTES page.

3.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

4.
Links to online exchanges I participated in were posted on the FP ONLINE page.

5.
I agree with most of Cary Millsap's take on NoSQL and Oracle, Sex and Marriage, but note that there is no reference whatsoever to the implications of the data models involved.

6.
After First Great Blunder Refuted consider my Type vs. Domain and Class.

7.
Another job description: Analytics- Data Modeler. Any idea why I post such?



Sunday, April 7, 2013

More on Relational Denial



Note: What follows are my comments on a LinkedIn exchange, So What is a 'Large Database'? Minor edits of the online comments for grammar, clarity, precision and coherence are within square brackets.
PS: No doubt Oracle/SQL Server/etc are designed and optimized to deal with normalized data. That's where the power lies. They're like Sirens though ... those who don't respect them with proper designs are destined to have performance crashes (bear with me on this metaphor will ya? :)

Sunday, March 31, 2013

Site Update



1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page. Many pronouncements exhibit a poor grasp of foundation knowledge. It is a rich target for debunking, so I may tackle it at some point. You may want to test your mastery of fundamentals before I do.

3.
Links to online exchanges I participated in were posted on the FP ONLINE page.

4.
Job description: Solutions Developer. No comment.

5.
An oldie but goodie republished: Leonardo Was Right!


Thursday, March 28, 2013

Social BigData and Relational Denial



Note: Minor edits 3/29/13.

In an online discussion initiated by the question Does It Matter If Data is BIG or not? MQ commented:
I still feel the discussion around Relational Modelling is confusing the point, and should be put aside until the problem is understood. If a company came to me and said 'Help me solve my big data issue - I have a billion emails I want to analyse' my answer is not 'just create a logical model using relational model theory' because this does not supply an answer. I will make more ground if I say 'right, lets discuss what this is, what technology you have, where the fail points and choke points are, etc and model (relational model) that as part of the process'.
I've built data models for 25 years (all levels) and firmly believed in Relational Theory across this entire period, so I am not saying drop Relational Models, just saying don't start there. Interestingly, I don't get any backlash against relational modelling using this approach - so perhaps the issues mentioned are about how the concept is sold to clients (a weapon rather than an intellectual concept)?

Sunday, March 24, 2013

Site Update



1.
The 'Quote of the Week' was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

Here's an exchange between its author and myself:
I was thinking about database skills. When I started in the nineties, and systems were moving from Cobol to RDBMS, database and design skills were really valuable. Now, as much as we know that DB skills are important and valuable, it seems to be the GUI that is more important. At least in management's mind. :( At my alma mater, no student is studying the database track now. They either do app development, or networking. What does this mean for the future?
That is why the new generation of products are either applications and files, or labeled DBMSs when they are really application-specific DBMSs. Many practitioners do not distinguish between DBMS, applications, network and OS functions. It's all one big lump.
To which I add: This is also the reason application developers like object-orientation: it is a programming, not data "paradigm".

3.
Links to online exchanges I participated in were posted on the FP ONLINE page.

4.
The following Advanced Database Design and Implementation - Course Outline requires no comments for the reader possessing foundation knowledge. It is good evidence of academia pursuing industry fads rather than leading it with science.
This year the course will examine the following two contemporary fields in the database systems area: XML Data Model and XML Databases and Data Warehousing.

XML Data Model and XML Databases will comprise approximately 65% of the course. There, we shall consider topics such as: XML documents, Document Type Definition (DTD) and XML Schema, XML constraints, XML query languages, Types of XML Databases, Mapping XML data to relational databases, Publishing relational databases as XML documents, and what research is going on in the XML database area. The practical experience will be achieved through the use of XML processors like xmllint and the native XML database management system eXist.
...
By the end of the course, students should be able to:
  • Design well formed XML documents that are valid with regard to a given DTD or XML Schema and thus develop the ability to solve practical engineering problems (BE graduate attribute 3(f)),
  • Analyze a part of the real world and design a corresponding XML DTD or Schema in XML normal form and thus develop the ability to formulate and build efficient models of complex systems using principles of engineering science and mathematics (BE graduate attribute 3(b) and BE graduate attribute 3(c)),
  • Design faithful models of a part of the real world using XML database constraints and thus develop the ability to apply mathematical and engineering science in solving engineering problems (BE graduate attribute 3(a)),
  • Use available web sources to learn about the eXist XML database management system and define XQuery queries and XUPdata updates of an almost arbitrary complexity against a native XML database and thus develop the ability to look for additional information from pertinent sources (BE graduate attribute)...
Interestingly, the pre-requisite is familiarity with "Relational Data Model, Structured Query Language (SQL), Relational Functional Dependencies and Normal Forms, PostgreSQL Data Base Management System."

Can't figure out why.

5.
On the lighter side: Models and met-a-date.



Tuesday, March 12, 2013

Site Update



1.
The Quote of the Week was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page. The perils of online business modeling and database design and the time and effort imposed by the absence of foundation knowledge.

3.
Links to online exchanges I participated in were posted on the FP ONLINE page.

4.
The SCHEDULE page is now displaying an online monthly calendar which will be updated with my public seminars/lectures, with links to the details. The direct link is
http://pub11.bravenet.com/calendar/show.php?usernum=894201442.

5.
Recommendations:
  • Added Nijssen's CONCEPTUAL SCHEMA AND RELATIONAL DATABASE DESIGN to the recommended books (available via the home page). It is, in my opinion, the best that can be done at the informal conceptual/business level.
  • Moseley, B., and Marks, P., Out of the Tar Pit. A good read on complexity and the benefits of the relational model (h/t Eric Kaun).
6.
Miscellaneous:
  • Somebody was endorsed for 'Thought Leadership'. I guess this reflects the increasing rarity of thinking and thinkers. Time to appoint Chief Thought Officers.
  • Solutions Developer. An excellent example of the factotum approach to hiring and the exclusive demand for tool experience. Consider the probability that one person can be sufficiently competent in all the tools, without any guarantee of foundation knowledge. Related: A Data Warehouse quiz.
  • Making Friends with Science provides some context for the previous two items:
Making friends is truly the beginning of making lasting memories. To make friends with science is truly to start with making good friends that make lasting memories about science. I'm starting a new revolution in the way science will be made socially for the community and ask the community to step in and help make science fun, engaging, real, social and most importantly lasting friendships.


Thursday, March 7, 2013

Site Update



1.
The Quote of the Week was posted on the QUOTES page.

2.
A 'To Laugh or Cry' item was posted on the LAUGH/CRY page.

3.
Links to online exchanges I participated in were posted on the FP ONLINE page.

4. 
Looking for non-proprietary reference Semantic Data Model of Distribution Requirement Plan 

Is there any standard LDM exists for Automotives like CLDM or FSLDM 

Require database for banking customers  

Database table normalization

Detect a pattern? 

5.
My predicted consequences of the BigData and BI fad come to pass. On the one hand: 

Big Data News Roundup From Porn to Data-ism

On the other:

Trends Shows Problems of Big Data Without Context

What is the purpose of DENSITY in STATISTICS

