Half a croissant, on a plate, with a sign in front of it saying '50c'
h a l f b a k e r y
Fewer ducks than estimates indicate.

idea: add, search, annotate, link, view, overview, recent, by name, random

meta: news, help, about, links, report a problem

account: browse anonymously, or get an account and write.



Query-first database

Most SQL databases use B+ trees
  (+2, -1)
(+2, -1)
  [vote for,

Imagine a deliberately limited database that can only run queries that you specify that you will run.

From this information (and limitation) it can use this information to pick data structures that maximize performance whereas a traditional database uses the same tabular format regardless of the query.

For example, if a query that you have stored uses the whole table, it's more efficient to use a column orientated database.

Or if you only ever do simple joins between two tables, such as `select people.firstname, orders.name from people join orders on people.id = orders.person_id where people.firstname = 'chronological'`, you could store orders by person.

or if you have a query that is like select people from orders where name = "chronological" you can have a file per person on disk.

FTR, Postgres is efficient enough with its tabular database, but I think this idea has some merit.

chronological, Apr 11 2020

I think this (materialized views), plus some triggers*, will get you that. https://en.wikipedi...i/Materialized_view
*I mean some relatively simple triggers - with arbitrarily complex database triggers you can do anything, but probably shouldn't. [pertinax, Apr 12 2020]

Putt's Law https://en.wikipedi...ccessful_Technocrat
Relevant to user-maintained systems [8th of 7, Apr 21 2020]


       So, adaptive organizational structure ? Bit of a pain on a large DB if it starts to thrash.
FlyingToaster, Apr 11 2020

       The queries are set in advance so the database knows how to lay out the data.   

       If queries need to be changed then yes, that would cause thrashing as data has potentially to be rewritten into a new format.
chronological, Apr 11 2020

       That's what's going to make it noncommercial. In commercial environments, where the money's made, inherent flexibility and the capacity to make changes quickly and cheaply are primary requirements.   

       The idea assumes a static and tighty-bounded system, which are rare in the real world.   

       Many users can easily generate a new query, perhaps based on an existing query. Redesigning a database requires constant SysAdmin input, downtime, and the potential to introduce errors and/or lose data.   

       Businesses won't use such a system, hence while it is technically possible, and interesting (though such concepts have been raised in academia since the '60s) it's not viable as a product.
8th of 7, Apr 12 2020

       pertinax Materialized views are still based on the tabular data format that the database uses.
chronological, Apr 12 2020

       You might as well just use a spreadsheet. Lots of _lusers do that.
8th of 7, Apr 12 2020

       I really think there should be a spreadsheet backed by a Postgres database. It would be fast.
chronological, Apr 12 2020

       How fast does a spreadsheet need to be ? They're a clunky and inefficient way of manipulating large data sets (tho very useful for small ones) and at small sizes they're acceptably fast even on modestly powerful systems.
8th of 7, Apr 12 2020

       I want a fusion of databases and spreadsheets.   

       The ease of defining relations with a spreadsheet and formulas.   

       The performance of a database.   

       Would be awesome. Imagine an order system implemented this way.
chronological, Apr 12 2020

       ... and then let the users maintain it ?   

       Have you ever tried hypnotic regression ? You should. You really should. We guess that in a past life, you were a Kamikaze pilot ...
8th of 7, Apr 12 2020

       //a fusion of databases and spreadsheets//   

       Devices exist to fuse the two.   

       The resulting "solutions" tend to produce maintenance nightmares.
pertinax, Apr 12 2020

       // tend to //   

       Sp. "always".   

       Unless it's a single-user design where the designer, user and maintainer are the same individual, it's like juggling running chainsaws; fascinating to watch, because it's only ever a matter of time before it goes horribly wrong ...
8th of 7, Apr 13 2020

       That's true, but the administrators will inevitably tend to create the queries that they perceive as most useful to them, rather than those that the users perceive as important*.   

       The Admins want to keep the database running well; that is their goal. They will subordinate that to other tasks, since if the database degrades or fails they will be criticised. If the users merely get slow service, or they query they want is not created for them in an acceptable timeframe, the criticism will be within accepted bounds and can be deflected by resort to the List of Standard Excuses.   

       *Since users are not in fact important unless they have sufficient authority to make the Admins lives unpleasant, they are by definition unimportant. Better, those who have such authority almost always lack the technical understanding of the issues and can almost always be fobbed off with a bit of carefully considered Technobabble.
8th of 7, Apr 14 2020

       // Sp. "always".  //   

       Actually, no; there is a sweet spot. It can be found where
1. there are reasonably intelligent, responsible "power users" taking responsibility for the spreadsheets themselves, (and not the databases or the "fusion" mechanism) and
2. (this is important) the data-flow is almost entirely one- way, viz., from database to spreadsheet, and *not* vice versa. Given this "look, don't touch" limitation, you can get away with software engineering practices which would be disastrous otherwise.
pertinax, Apr 20 2020

       // you can get away with software engineering practices which would be disastrous otherwise. //   

       That sounds like the strategy of someone who already has a totally reliable exit plan from the organization, through retirement, promotion, or departure for a better post elsewhere...
8th of 7, Apr 21 2020

       Spot on, [8th]; I did tiptoe away from that organisation about ten years ago.
pertinax, Apr 21 2020

       And were no doubt sufficiently distanced from it not to get spattered when the excrement encountered the air-handling system ?   

       See also Putt's Law <link>
8th of 7, Apr 21 2020

       Actually, the excrement never did hit the air-handling system.   

       The engineering practices *would have been* disastrous with a different solution architecture but, with the actual solution architecture, they were OK.
pertinax, Apr 21 2020

       <Peers into Crystal Ball/>   

       We see... a small, tightly knit team of capable and motivated technical staff, and a management with just enough wit to realize that although they don't understand the issues, the team does, and the best and safest policy is not to interfere.   

       Close ?   

       Genuine conversation (not a copy from Dilbert):

       "Do we have a solution for this, then ?"


       "So what is it ?"

       "I'm not going to tell you."

       "But I'm the manager. Why not ?"

       "If I tell you, it might encourage you to make decisions in a futile attempt to modify the outcome of events, which is unacceptable. Just sign this requisition. The problem can then be considered solved."

       "Oh God...."

       (He was, by and large, a very good manager).
8th of 7, Apr 21 2020

       // Close ? //   

       No, but bless you for trying.   

       When did you last have that ball serviced?
pertinax, Apr 22 2020

       This halfbakery idea is not a fusion of spreadsheets and databases. That was an offhand idea.
chronological, Apr 22 2020

       So, the core idea was for proprietary file formats, optimized for a limited set of use cases only?
pertinax, Apr 22 2020

       > So, the core idea was for proprietary file formats, optimized for a limited set of use cases only? — pertinax   

       Kind of. For a given query, there is a data structure that is optimised for answering that question.   

       You could model Halfbakery as a set of tables in a SQL database. Flat files and directories or you could have objects in memory.
chronological, Apr 24 2020

       I think proprietary file formats for limited use cases are WKTE. As the app develops, you can waste a lot of effort reinventing the wheels of referential and transactional integrity - and, if you fail to reinvent those wheels correctly, you get some nasty bugs.
pertinax, Apr 24 2020

       There isn't a database that exists that accepts queries and uses them to schedule data storage format.
chronological, Apr 24 2020

       So, the idea is now for an application which re-configures its own persistence layer dynamically in response to queries?
pertinax, Apr 24 2020

       Correct, except you load the queries up in advance of running them, indeed before you put the data in the database.   

       So you have a database that is fast for the queries that you load it with.
chronological, Apr 24 2020

       That sounds very much like the 4GLs that were popular in the early 90s.
8th of 7, Apr 24 2020

       I imagine providing a histogram with your queries.   

       If most of your queries are a simple join, you can optimise for that case in your data model.
chronological, Apr 24 2020

       Fine as a teaching aid, but real-world situations are much more complex, hence the reason databases are the way they are.
8th of 7, Apr 24 2020


back: main index

business  computer  culture  fashion  food  halfbakery  home  other  product  public  science  sport  vehicle