Thursday, November 21, 2013

Structuring the World With 'Unstructured Data'

Database management depends on structure -- of reality and of data representing it in databases -- which determines the data manipulation and integrity enforcement by database management systems (DBMS).

Having argued this for decades, I have been, predictably, quite skeptical of the hype about systems that manage and extract information from so-called "unstructured data," purportedly obviating the need for business modeling and database design. It's also why I've been skeptical of the criticism of SQL-based DBMSs as inflexible because they force big data, much of it text, into tabular schemas that it doesn't fit or that are difficult to envision upfront.

Saturday, November 16, 2013

Theory Applied Right: N-ary R-tables, NULLs and Keys

In a comment on one of my posts, E writes:
E: Codd has based his model on n-ary relations and that is the key mistake he has made ; that leads to complex structure (absolutely not necessary) and situations where there are no values known and as a consequence the need of the concept we know too well -> the null pointers; bin-ary/2-ary relations (smallest possible) are sufficient to express any predicate/sententional formula and there is no possibility to have something like null; if a value is unknow then we do not know it thus it is not a fact for us thus it is not in our database; function is a special case of a relation...
Not accurate.
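
To make E's proposal concrete, here is a minimal sketch of what it seems to amount to: each n-ary tuple is split into binary relations of (key, value) pairs, and a fact whose value is unknown is simply left out. The employee data, the attribute names and the function `to_binary_relations` are hypothetical illustrations, not taken from E's comment.

```python
# Hypothetical illustration of splitting n-ary tuples into binary relations.
# Attribute names and data are assumptions made for this sketch only.

def to_binary_relations(key_attr, rows):
    """Return {attribute: set of (key, value) pairs}, one binary relation
    per non-key attribute. Unknown values (None) produce no pair at all."""
    relations = {}
    for row in rows:
        key = row[key_attr]
        for attr, val in row.items():
            if attr == key_attr or val is None:
                continue  # an unknown value is not recorded as a fact
            relations.setdefault(attr, set()).add((key, val))
    return relations


rows = [
    {"emp_id": 7, "name": "Ann", "salary": None},   # salary unknown
    {"emp_id": 8, "name": "Bob", "salary": 50000},
]
relations = to_binary_relations("emp_id", rows)
print(relations["name"])
print(relations["salary"])
```

Note that even in this sketch every binary relation still depends on an identifier (here `emp_id`) to tie an entity's facts together.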

Saturday, November 9, 2013

Site Update

1. Quote of the Week.
My best advice in all architecture, and platform choices, like RDBMS vs NoSql. The number 1 question, every single assumption you have about the system, the "this is important because X Y X and this that etc etc. Every single "has to be" you have there, expect every single part of it to change. How would you build your system if every "key fact" was expected to change. These key facts, that are supposed to be pillars are actually volatility themselves and need accounted for, not accepted. --LinkedIn

2. To Laugh or Cry?
How can we add employees, dept and location tables in oracle 10g?

3. Online exchanges I participated in

I am referring you back to an item I posted in the last update:
What is a Data Model And which “Data Model” do you prefer?
for two reasons: comments have been added since then that should be read, and some of them belong in the "To Laugh or Cry" category.

4. What do these two items tell you?
Next gen NoSQL: The demise of eventual consistency
Currently search engines are thought of as tools to find text but Ashok Chandra, Microsoft distinguished scientist and general manager of the Interaction and Intent Group at Microsoft Research Silicon Valley, believes people soon will think of search engines as “task engines.”
“Search technology began with words,” says Chandra.  “We built a whole search infrastructure around words. But in this new era of search, we are working with entities, because people think in terms of them, such as a hotel, a movie, an event, a hiking trail, or a person. The Leibniz platform is designed from the ground up to deal in entities, with the goal of making it easier for people to accomplish the tasks they set out to do.”
--A Look at Microsoft’s ‘Leibniz’ Platform
BTW, I love "Interaction and Intent Group". Wonderful.

5. And now for something completely different.
I Challenged Hackers to Investigate Me

Saturday, November 2, 2013

More on E/RM: Still Not a Data Model

In a previous site update I linked to three online exchanges on my post about E/RM and considered a response. Here are some thoughts.
MQ: I find this a strange discussion - is there value other than in the realm of philosophy? If ERMs are used by IT professionals across the world to direct the design and build of the majority of applications guided by standard methodologies, is the view of this argument that these were all build wrongly? Regardless of success? Out of interest, is there a common Relational Modelling tool, that is not also an ERM tool, that models the full Codd definition? Is the inferred conclusion that only the RM models data, and ERM, BM, OOM, BOM, plus any other techniques do not? I think that is a little limiting.