gotqn avatar image
gotqn asked

Storing IP addresses - varchar(45) vs varbinary(16)

I am going to create a table with two fields - `ID` as `BIGINT` and `IPAddress` as either `varchar(45)` or `varbinary(16)`. The idea is to store all unique IP addresses and use a reference ID instead the actual IP address in other tables. I have already written SQL CLR functions for transforming IP address bytes to string and the reverse and need to choose the best type for storing the IP addresses - `varchar(45)` or `varbinary(16)`. By best I mean that the following should be executed fast as possible: 1. a stored procedure will insert new IP address in the table if the IP address does not exist 2. the procedure will return the ID if such IP address exists I can tell, how many different IP address I am going to collect but they will be more, so I need to choose the column which will provide better performance. Also, I am planning to include the value as a clustered index column.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

What, no IPv6 compatibility? Seriously, you need to design this in now.
2 Likes 2 ·
Just to clarify, the ID column is going to be the clustering key or the IPAddress?
0 Likes 0 ·
I think first the `ID` to be the clustered key column, and maybe to create other index on `IPAddress` only, but I am open to suggestions to test.
0 Likes 0 ·
What? There will be IPv6 compatibility - that's why the fields are so big.
0 Likes 0 ·

1 Answer

Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
Based on your two suggested datatypes, I'm guessing you will store both IPv4 and IPv6 address in the same column? I wouldn't design it that way - I prefer to store one value per column, not two, as you can't be sure a client will have the same IPv4/IPv6-combination over time. Anyway, about the storage column, it depends on basically two: - How fast are your CLR-functions? - How often will you access the actual IP-address(es)? Since your functions won't require data access and they are most certainly deterministic (the later will have to be specifically set for the SQL function definition, default value is False), you will still be able to get parallell query execution, and therefore I would suggest varbinary storage (based on CPU being faster than disk I/O in general). But don't hold that against me if it later turns out that varchar(45) would have been faster - it still depends on the performance of the function, how often the conversion is made, how likely it is that the whole (or most of the) table fits in memory cache on the server etc.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.