I have recently started reading Applied Mathematics for Database Professionals by Lex de Haan. It is a very interesting book so far, but despite the title it seems more theoretical than applied. Is there a real need for a DBA to have a deep understanding of mathematics? If there is, what other good resources are there for this?
I'd say... ***drumroll***... It depends. It depends a lot on what you call mathematics. When I studied "Algorithms and Data Structures" we learned a lot about efficiency calculations. It was officially a course in Computer Science, but in reality it was very mathematical. At my first university, Automata Theory was considered mathematics; at other universities the same course would have been a computer science course. Generally, I'd say at least a basic understanding of algebra and logic is a must for any database professional. Depending on what you include in the profession "DBA", you might or might not need knowledge of algorithm efficiency. Not deep knowledge, but you need to know intuitively which of two algorithms is the more efficient if you're going to study and understand execution plans. Finally: it always helps to know mathematics. It makes life easier, regardless of your profession. But that goes for a lot of academic disciplines, so if you followed the rule "learn whatever you will later find useful" you would study a lot and work very little.
In general I would say no. If your business users are in the financial/tech/science industry then some of their reporting and modeling needs might require a higher understanding of the math. If you are responsible for the data-mining models or machine-learning models then probably yes.
I'd say it's essential. An understanding of set theory, and particularly of algorithmic maths, is absolutely invaluable. For example, understanding how hash tables work and how indexes really work: these are not things that come naturally without an understanding of the basics.
I would say yes. Set theory (obviously) and an understanding of probability are necessary. For example, how many database restores out of your thousand databases do you need to do in order to be, say, 95% confident that they all work? See Thomas "Rockstar" LaRock's article [Statistical Sampling for Verifying Database Backups] and if you fully understand the maths behind it, then you're a better man than me.
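To give a feel for the kind of calculation involved, here is a minimal sketch. It is not the method from the linked article; it is a plain hypergeometric model that I am assuming for illustration, and the function name `restores_needed` and its parameters are hypothetical. It asks: if `bad_dbs` of the backups were broken, how many randomly chosen restores (without repeats) would it take to hit at least one broken backup with the given confidence?

```python
from math import comb

def restores_needed(total_dbs, bad_dbs, confidence):
    """Smallest sample size n such that a random sample of n backups
    (drawn without replacement) contains at least one bad backup with
    probability >= confidence, assuming exactly bad_dbs are bad.

    Illustrative hypergeometric model only; names are hypothetical.
    """
    for n in range(1, total_dbs + 1):
        # Probability that all n sampled backups come from the good ones.
        p_miss = comb(total_dbs - bad_dbs, n) / comb(total_dbs, n)
        if 1 - p_miss >= confidence:
            return n
    return total_dbs

# Sample size needed to catch a 1% failure rate (10 bad out of 1000).
print(restores_needed(1000, 10, 0.95))
# Sample size needed to catch a single bad backup among 1000.
print(restores_needed(1000, 1, 0.95))
```

Under this model, a clean sample only bounds the failure rate; it does not show that every backup works. Catching a single bad backup among 1000 with 95% confidence takes roughly 950 restores, whereas catching a 1% failure rate takes far fewer.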
Yes. Relational database design is based in logic and mathematics. So are the building blocks of database queries. So those mathematical concepts are the toolkit you can use to solve database problems. Knowledge of the underlying principles is particularly relevant to SQL because SQL is a rather flawed attempt to imitate a relational database language. If you truly understand the relational foundations on which SQL was supposed to be based then you'll be more likely to avoid some of the pitfalls of working with SQL.
I've read the book and it is theory-laden. Some of it takes a second reading to wrap your head around the authors' mindset. The title, Applied Mathematics for Database Professionals, sums it up - dB PROFESSIONALS. In this case I choose to view professionals as individuals determined to know their craft and focused on continuous improvement. de Haan and Koppelaars provide an excellent treatise on Set Theory with a goal of using it to perform dB design and, as it says on the cover, "communicate precisely about those designs with other stakeholders." This is the sort of book that can take an accidental DBA from thinking of SQL as merely a series of Excel rows and columns to seeing just how complicated - yet simple - the reality is. As for the average dB Pro requiring Hawking-level mathematics skill... generally - no.
I wrote an [algorithm to determine natural keys on raw data], and the math behind [combinatorics] certainly comes into play for estimating performance. If I want to find all possible unique keys on a table of 25 columns, with a maximum key size of 3 columns, there are C(25, 3) + C(25, 2) + C(25, 1) = 2625 possible keys ("C(25, 3)" means "out of 25 columns, pick 3"). If I allow my algorithm to search that table for a key of length up to 25 columns, there are over 33 million possibilities to query (no one would do that, but it's still good to know how much that would cost).
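The counts above are easy to check. This is only a small sketch of the arithmetic, not the key-finding algorithm itself, and the function name `candidate_key_count` is made up for illustration:

```python
from math import comb

def candidate_key_count(columns, max_key_size):
    """Number of distinct column combinations of size 1..max_key_size.

    Hypothetical helper: it only counts candidate keys and does not
    test them against any data.
    """
    return sum(comb(columns, k) for k in range(1, max_key_size + 1))

print(candidate_key_count(25, 3))   # 2625
print(candidate_key_count(25, 25))  # 33554431, i.e. 2**25 - 1
```

Each candidate would cost at least one uniqueness query against the table, so the count gives a rough upper bound on the work before any pruning.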