TimothyAWiseman asked

Mathematics for the DBA

I have recently started reading Applied Mathematics for Database Professionals by Lex de Haan, and so far it is a very interesting book, although despite the title it seems more theoretical than applied. Is there a real need for a DBA to have a deep understanding of mathematics? If there is, what other good resources are there for this?
theory mathematics

Magnus Ahlkvist answered
I'd say... ***drumroll***... it depends. It depends a lot on what you call mathematics. When I studied "Algorithms and Data Structures" we learned a lot about efficiency calculations. It was officially a computer science course, but in reality it was very mathematical. At my first university, Automata Theory was considered mathematics; at other universities the same course would have been a computer science course.

Generally, I'd say at least a basic understanding of algebra and logic is a must for any database professional. Depending on what you include in the job title "DBA", you might or might not need to know about algorithm efficiency. Not a deep knowledge, but you need to know intuitively which of two algorithms is the more efficient if you're going to study and understand execution plans.

Finally: you always need to know about mathematics. It makes life easier, regardless of your profession. But that goes for a lot of academic disciplines, so if you followed the rule "learn whatever you will later find useful" you would study a lot and work very little.
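To make the point about comparing two algorithms concrete, here is a minimal Python sketch that counts the work done by two ways of joining the same inputs. The function names and the toy data are made up for illustration, and a real query optimizer's costing is far more involved; the sketch only shows why one plan shape can be roughly quadratic and another roughly linear.

```python
def nested_loop_join(left, right):
    """Compare every left key with every right key."""
    comparisons = 0
    matches = []
    for left_key in left:
        for right_key in right:
            comparisons += 1
            if left_key == right_key:
                matches.append((left_key, right_key))
    return matches, comparisons


def hash_join(left, right):
    """Build a hash table on one input, then probe it once per row of the other."""
    buckets = {}
    for right_key in right:
        buckets.setdefault(right_key, []).append(right_key)
    probes = 0
    matches = []
    for left_key in left:
        probes += 1
        for right_key in buckets.get(left_key, []):
            matches.append((left_key, right_key))
    return matches, probes


# Hypothetical inputs: two lists of 1,000 keys that overlap on 500 values.
left = list(range(1000))
right = list(range(500, 1500))

nl_matches, nl_work = nested_loop_join(left, right)
hj_matches, hj_work = hash_join(left, right)
print(nl_work, hj_work)
```

Both functions return the same matches, but the nested loop does len(left) * len(right) comparisons while the hash join does one probe per left row after a single pass to build the table.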
5 comments

Scot Hauder commented ·
I think you explained my point of view more eloquently than I did. Of course you need basic math skills to function day-to-day; how would you calculate storage requirements without them? But are you doing any heavy lifting with mathematics on a regular basis? Probably not (pun not intended). Your ability to think on your feet and function in high-pressure situations, where the C-level staff is looking over your shoulder while you get the mission-critical production system back online, is more valuable than anything you are going to learn from a book on theory.
David 1 commented ·
@Scot: Understanding the basic theory is what helps you get the correct results from your databases. I would suggest that is MORE valuable, not less valuable than getting a "mission-critical production system back on-line". The latter is merely a tactical, product-specific skill that will be relearnt many times during a career. Database principles on the other hand are fundamentally useful knowledge that will probably endure for at least another generation.
Scot Hauder commented ·
OK. The basic theory of what helps you get the correct results? I agree database principles are important, but how are they mathematically intensive?
David 1 commented ·
For example, dependency theory. Understanding join dependencies and how to analyse dependencies will help you eliminate redundancy in database design. Knowing about relational algebra and calculus will help you understand problems, communicate your ideas and understand the solutions produced by others. How much that requires a "deep" knowledge of math is subjective, I suppose. For example, I would say that some of the material in the Alice Book (Abiteboul et al.) is pretty mathematically intensive, but maybe that's just because my own maths isn't so good!
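As a small illustration of what analysing a dependency can look like in practice, the Python sketch below tests whether a functional dependency (the simplest member of the dependency family, of which join dependencies are the most general) holds in a set of rows. The `holds` helper, the table, and the column names are all hypothetical. Sample data can only refute a dependency, never prove it for all future rows.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # The first value seen for a key is remembered; any later
        # disagreement means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical orders table that repeats the customer's city on every order.
orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Oslo"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Oslo"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Lund"},
]

print(holds(orders, ["customer_id"], ["customer_city"]))
print(holds(orders, ["customer_city"], ["order_id"]))
```

If `customer_id -> customer_city` holds while `customer_id` is not a key of the table, the city is being stored redundantly, which is the usual cue to move it into a separate customer table.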
TimothyAWiseman commented ·
You make good points; the line between theoretical computer science and mathematics is a very blurry one indeed. Since I hold a math degree and am working through the book I mentioned right now, I obviously think there is value in knowing mathematics, but I was very interested in seeing whether that was the consensus view or whether I was in a minority. From the answers and the upvotes here, it looks like most people think you need to know math to be a good DBA, but the exact types and amounts seem to be highly debated and largely dependent on the individual's exact position.
Scot Hauder answered
In general I would say no. If your business users are in the financial/tech/science industry then some of their reporting and modeling needs might require a higher understanding of the math. If you are responsible for the data-mining models or machine-learning models then probably yes.
2 comments

Matt Whitfield ♦♦ commented ·
@Scot Hauder - not relevant here, but was wondering - given that you're the user here with most rep on Ask.SQLTeam - do you have any thoughts on the 'how best to merge' question?
Scot Hauder commented ·
@Matt - I liked your idea of having a parallel site to get the data and tags into the right format before merging. I can foresee some dupes, as I've noticed many people post the same question on both sites (shotgun approach, I suppose). There were only a handful of us active over there, so you could probably do a good hatchet job and not upset anyone.
Matt Whitfield answered
I'd say it's essential. An understanding of set theory, and particularly algorithmic maths, is absolutely invaluable. For example, understanding how hash tables work and how indexes really work: they're not things that come naturally without an understanding of the basics.
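A rough way to see why the basics matter for indexes is to count steps. The Python sketch below uses a sorted list as a stand-in for an index (real database indexes are typically B-trees, so this is only an analogy) and compares a full scan with a binary search. The function names and the data size are assumptions chosen for illustration.

```python
def scan_steps(keys, target):
    """Count the rows examined by a straight scan until the target is found."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            break
    return steps


def seek_steps(sorted_keys, target):
    """Count the comparisons made by a binary search over sorted keys."""
    low, high, steps = 0, len(sorted_keys), 0
    while low < high:
        mid = (low + high) // 2
        steps += 1
        if sorted_keys[mid] < target:
            low = mid + 1
        else:
            high = mid
    return steps


keys = list(range(1_000_000))   # hypothetical table of one million sorted keys
target = 999_999                # worst case for the scan: the last row

print(scan_steps(keys, target), seek_steps(keys, target))
```

The scan grows linearly with the number of rows, while the search grows with the logarithm of it, which is the intuition behind an index seek beating a table scan on a selective lookup.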
3 comments

Scot Hauder commented ·
Absolutely, I agree you need to understand the basics. I haven't read the book Timothy cites but figured he was wondering about higher mathematics
Matt Whitfield ♦♦ commented ·
@Scot - well, when I did A-levels I took 3 pure maths modules, 2 statistics modules and 1 decision maths module. I have used the pure maths twice (once to solve a quadratic equation that makes the tooltips in my apps appear as close to the golden ratio as possible, and once for my as-yet-unreleased data exploration app). However, decision maths I use *all* the time. Stats - well, I haven't used that very much that's for sure!
Scot Hauder commented ·
@Matt - from a development perspective, which I know you do a lot of, we definitely run into these problems. If you want to do more statistics, get into data-mining algorithms. Lots of probability and integrals!
ThomasRushton answered
I would say yes. Set theory (obviously), and an understanding of probability is necessary. For example, how do you know how many database restores you need to do from your thousand databases in order to be, say, 95% confident that they all work? See Thomas "Rockstar" LaRock's article [Statistical Sampling for Verifying Database Backups][1] and if you fully understand the maths behind it, then you're a better man than me.

[1]: http://thomaslarock.com/2010/05/statistical-sampling-for-verifying-database-backups/
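One simple way to reason about a question like this, which is not necessarily the method used in the linked article, is as follows. Suppose a fraction p of the backups were bad. A random sample of n restores would then all succeed with probability about (1 - p)^n, treating the draws as independent, which is a fair approximation when the sample is small relative to the thousand databases. So n clean restores give the chosen confidence that fewer than p of the backups are bad once (1 - p)^n drops to 1 minus that confidence. The Python sketch below solves for n; the function name and the thresholds are assumptions.

```python
import math


def sample_size(confidence, max_bad_fraction):
    """Smallest n such that n clean restores rule out a failure rate of
    max_bad_fraction or worse at the given confidence level.

    Solves (1 - max_bad_fraction) ** n <= 1 - confidence for n.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_bad_fraction))


# 95% confidence that fewer than 1% (or 5%) of the backups are bad.
print(sample_size(0.95, 0.01))
print(sample_size(0.95, 0.05))
```

Note what this does and does not say: sampling can bound the failure rate, but being confident that every single one of the thousand backups works requires p to approach zero, and the required sample then approaches the whole population.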
1 comment

TimothyAWiseman commented ·
Thanks for the link, it was very enlightening. Sometime soon I need to learn more statistics.
David 1 answered
Yes. Relational database design is based in logic and mathematics. So are the building blocks of database queries. So those mathematical concepts are the toolkit you can use to solve database problems. Knowledge of the underlying principles is particularly relevant to SQL because SQL is a rather flawed attempt to imitate a relational database language. If you truly understand the relational foundations on which SQL was supposed to be based then you'll be more likely to avoid some of the pitfalls of working with SQL.
3 comments

Scot Hauder commented ·
Interesting, do you short all of your investments too?
2 Likes
Matt Whitfield ♦♦ commented ·
I'm sorry - but I have to ask - How come you've only voted 14 times, with 5 of those times to vote someone down?
David 1 commented ·
Maybe because voting down bad answers is more important than voting up good ones. And there just aren't that many really bad answers on here.
Blackhawk-17 answered
I've read the book and it is theory-laden. Some of it takes a second reading to wrap your head around the authors' mindset.

The title, Applied Mathematics for Database Professionals, sums it up - dB PROFESSIONALS. In this case I choose to view professionals as individuals determined to know their craft and focused on continuous improvement. de Haan and Koppelaars provide an excellent treatise on set theory with the goal of using it to perform dB design and, as it says on the cover, "communicate precisely about those designs with other stakeholders."

This is the sort of book that can take an accidental DBA from thinking of SQL as merely a series of Excel rows and columns to seeing just how complicated - yet simple - the reality is.

As for the average dB pro requiring Hawking-level mathematics skill... generally - no.

Jesse McLain answered
I wrote an [algorithm to determine natural keys on raw data][1], and the math behind [combinatorics][2] certainly comes into play for estimating performance. If I want to find all possible unique keys on a table of 25 columns, setting a max key size to 3 columns, there are C(25, 3) + C(25, 2) + C(25, 1) = 2625 possible keys ("C(25, 3)" means "out of 25 columns, pick 3"). If I allow my algorithm to search that table for a key of length up to 25 columns, there are over 33 million possibilities to query (no one would do that, but it's still good to know how much that would cost).

[1]: http://www.sqlservercentral.com/scripts/T-SQL/62086/
[2]: http://en.wikipedia.org/wiki/Combination
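The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the counting, not the linked key-finding script; `candidate_key_count` is a name made up here, and `math.comb` needs Python 3.8 or later.

```python
from math import comb


def candidate_key_count(columns, max_key_size):
    """Number of column combinations of size 1..max_key_size to test as keys."""
    return sum(comb(columns, size) for size in range(1, max_key_size + 1))


print(candidate_key_count(25, 3))    # 2300 + 300 + 25
print(candidate_key_count(25, 25))   # every non-empty subset: 2**25 - 1
```

With no cap on key size the count is every non-empty subset of the columns, 2^25 - 1 = 33,554,431, which is where the "over 33 million" figure comes from.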

Mark answered
Does the book cover statistics? I was interested in [this][1] SQL program that uses linear regression.

[1]: http://www.sqlservercentral.com/articles/T-SQL/69334/
3 comments

Blackhawk-17 commented ·
The book presents a general methodology for looking at database design and business rules (constraints etc.) from a mathematical (set theory/logic) standpoint. It isn't a series of solutions or, despite the title, advanced mathematics with respect to reporting on data.
1 Like
Mark commented ·
OK, but at any rate that article is an example of applied mathematics using SQL.
Blackhawk-17 commented ·
@Mark - I agree. There will be some areas where a developer requires deep mathematics, and finding a resource or two for that wouldn't hurt - I was just circling around to the fact that the book doesn't deal with that.
Cyborg answered
Set theory; linear programming (a mathematical method for determining how to achieve the best outcome); genetic algorithms for heuristic search; etc.