IPFire 2.25 - Core Update 151 has been released. It comes with various package updates, a number of bug fixes in IPFire Location, and security improvements in the SSH service.

Please support our project with your donation.

Improvements to IPFire Location

Since the rollout of our new location database, we have made various improvements on the software implementation to increase accuracy and speed. These are now all included in this Core Update.

In addition to that, we now show whether an IP address is marked as an "anonymous proxy", "satellite provider" or "anycast", which helps with debugging network issues and investigating attacks.

Misc.

  • OpenSSH has been updated and no longer supports using SHA1 in the key exchange. Some outdated clients might not be able to connect to the IPFire SSH console any more. Please update your SSH client if you are encountering any problems.
  • A bug has been fixed where IPsec connections were not properly shut down when deleted on the web user interface. The connection was often re-established before it had been removed from the IPsec configuration, which could keep it active until the next reboot.
  • Marcel Follert has contributed two new packages: ncdu - an ncurses-based disk usage analyser - and lshw - a tool that lists the hardware installed in the system
  • Updated packages: binutils 2.35.1, boost 1.71.0, cmake 3.18.3, dhcpcd 9.1.4, fontconfig 2.13.1, freetype 2.10.2, iptables 1.8.5, knot 3.0.0, lcms2 2.9, libgcrypt 1.8.6, libidn 1.36, libloc 0.9.4, libnetfilter_conntrack 1.0.8, libnetfilter_queue 1.0.5, lmdb 0.9.24, logwatch 7.5.4, openjpeg 2.3.1, openssl 1.1.1h, poppler 0.89.0, qpdf 10.0.1, strongswan 5.9.0
  • Various Perl modules have been updated by Matthias Fischer: Digest::SHA1 2.13, Digest::HMAC 1.03, Net::DNS 1.25, Net::SSLeay 1.88

Add-ons

  • Updated packages: avahi 0.8, bacula 9.6.6, cups 2.3.3, cups-filters 1.27.4, dnsdist 1.5.1, freeradius 3.0.21, Git 2.28.0, guardian, haproxy 2.2.4, iptraf-ng 1.2.1, keepalived 2.1.5, libmicrohttpd 0.9.71, libsolv 0.7.14, lynis 3.0.0, nginx 1.19.2, stunnel 5.56

Thanks to everyone who contributed to this update by either submitting patches or helping us test it.


This is a more in-depth article about how libloc works internally. It might be slightly too tech-savvy for some readers, but it could still be a fun read if you would like to know more about the challenges and implementation of IPFire Location.

When we started the project, it was immediately clear that the biggest challenge would be packing the data into the database efficiently, so that it consumes as little space as possible and - at the same time - can be read as quickly as possible. This is required to make the library as versatile as possible, to enable applications that we are not even aware of yet (because you can never be too fast), and to scale down to the smallest systems that IPFire runs on.

The internet is a big space. Four billion IPv4 addresses are nothing. The IPv6 address space is large, and so are the addresses: 128 bits are 16 bytes. Storing the full address for the already allocated address space would be huge, and as the internet continues to grow at a fast pace, the database would very soon become bigger and bigger.

Storing it is not the only problem: the larger the database is on disk, the longer it takes to search through it. Loading it all into memory eventually stops being an option, and all of this results in performance problems.

Looking at applications

Many target applications for our library need to be fast. Nobody likes to wait for a website to load, but more importantly, applications like an Intrusion Prevention System need to be able to handle a large number of packets per second. If a source IP address needs to be classified using libloc, this must not take more than a couple of nanoseconds. A millisecond would already cause a performance impact, and connections would take too long to establish.

That is why the goal was for the database to be searchable in O(1) - or, for those who are not familiar with Landau symbols: every search should take the same amount of time, no matter where in the database the object is stored.

This becomes very clear if you imagine the database being organised like a spreadsheet: all networks are listed, and when you search, you go through the spreadsheet line by line until you have found what you are looking for. However, we need the closest match and therefore have to look at all networks in the spreadsheet.

Currently, the database has just under one million entries, which means comparing one million subnets to the IP address we are looking up. In Landau notation this is O(n): every network that is added makes the search longer, and the algorithm becomes slower and slower the bigger the database gets.

Generic search algorithms very often cannot be optimised any further than this. But in our case, we know more about our data: if we order the spreadsheet by network, we know that we can stop searching once we have passed the first network with a start address larger than what we are searching for.

That means we most likely do not have to search through the whole spreadsheet, but in the worst case we still do, since the element we are looking for might come last.
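
To make the contrast concrete, here is a minimal sketch of that naive O(n) scan in C - the structures and names are purely illustrative and not libloc's actual code:

```c
/* A purely illustrative sketch of the naive O(n) scan - not libloc's code.
 * Walk the whole "spreadsheet" and keep the most specific (longest-prefix)
 * match for the address. */
#include <stddef.h>
#include <stdint.h>

struct network {
    uint32_t start;    /* first address of the network (IPv4, host order) */
    unsigned int bits; /* prefix length */
};

/* Returns the index of the most specific network containing addr, or -1. */
static int lookup_linear(const struct network *nets, size_t n, uint32_t addr) {
    int best = -1;
    unsigned int best_bits = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = nets[i].bits ? ~0U << (32 - nets[i].bits) : 0;
        if ((addr & mask) == nets[i].start && nets[i].bits >= best_bits) {
            best = (int)i;
            best_bits = nets[i].bits;
        }
    }

    return best; /* every lookup has to touch all n entries */
}
```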

IP addresses are just bits: IPv6 addresses are 128 bits and IPv4 addresses are 32 bits long. If we simply wrote each network down as a sequence of bits (1010011001... and so on), we could store it in a binary tree:

[Figure: a binary tree (illustration from Wikipedia)]

Starting from the root node of the tree, we "turn left" if the first bit is zero and "turn right" if it is one. If we do that for each bit, we walk down the tree until we reach its end.

This operation - in our case - takes constant time: since there are never more than 128 bits in an IPv6 address, we can only ever walk 128 steps. That is a lot shorter than searching through one million entries. Brilliant!

We are not only storing IP addresses in the database, we are storing networks; for example: 2001:db8::/48. The number after the slash denotes that only the first 48 bits are relevant and the rest is the host part of the IP address. That means we do not even need to care about anything after the first 48 bits, which is even better, because our tree is getting shorter and therefore smaller.

It also means that we can decide how far we want to walk down the tree. Closer to the root there might be a larger subnet that was allocated to an internationally operating ISP, which has split this network further and assigned one smaller subnet for each country it operates in. There is a lot of flexibility.
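
To make the walk concrete, here is a simplified sketch in C - again with purely illustrative structures, not libloc's real on-disk format:

```c
/* A simplified sketch of the bit-wise tree walk - illustrative structures,
 * not libloc's real on-disk format. child[0] is taken for a 0 bit,
 * child[1] for a 1 bit; a node carries data if a network prefix ends there. */
#include <stddef.h>
#include <stdint.h>

struct network_data; /* country code, AS number, flags, ... */

struct node {
    struct node *child[2];
    const struct network_data *network; /* set if a prefix ends here */
};

/* Walk at most 128 bits (16 bytes) of an IPv6 address down the tree and
 * return the most specific network passed on the way. The walk is bounded
 * by the address length, so it takes constant time. */
static const struct network_data *lookup(const struct node *root,
                                         const uint8_t address[16]) {
    const struct network_data *best = NULL;
    const struct node *n = root;

    for (unsigned int i = 0; i < 128 && n; i++) {
        if (n->network)
            best = n->network; /* remember the closest enclosing prefix */

        unsigned int bit = (address[i / 8] >> (7 - i % 8)) & 1;
        n = n->child[bit];
    }

    if (n && n->network)
        best = n->network;

    return best;
}
```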

We make it fit

[Image: "Squashing data isn't always easy"]

Another advantage of binary trees is that they are sorted. This becomes relevant if we want to list all networks belonging to a certain country.

But how do we know that a network belongs to a certain country? At every leaf, a pointer is stored to an array that carries all the remaining information - or, to continue with the spreadsheet example: the other columns.

In our case, they contain the country code (like DE) and the AS number, if available. On top of that, there are a couple of flags which can be set to mark a network as part of a satellite network, an anonymous proxy or anycast.

All this is only a few megabytes for millions of entries. Therefore we have very little overhead and we did not have to compromise and potentially drop any data to make it fit.
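
As an illustration of how compact such a record can be, here is a hypothetical fixed-size layout in C; the field sizes and flag names are assumptions for the sketch, not the actual file format:

```c
/* A hypothetical fixed-size record per network - the real file format may
 * differ; this only illustrates why millions of entries stay small. */
#include <stdint.h>

#define NETWORK_FLAG_ANONYMOUS_PROXY    (1 << 0)
#define NETWORK_FLAG_SATELLITE_PROVIDER (1 << 1)
#define NETWORK_FLAG_ANYCAST            (1 << 2)

struct network_record {
    char     country_code[2]; /* e.g. "DE", no NUL terminator needed */
    uint16_t flags;           /* bitmask of the NETWORK_FLAG_* values */
    uint32_t asn;             /* autonomous system number, 0 if unknown */
};                            /* 8 bytes per network */
```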

Autonomous Systems

Storing those is a lot cheaper, since we can simply order them and write them one after the other. Ordering them allows us to perform binary search on the array, so they can be looked up very quickly. They can also be searched by name, which works without an index.
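
A minimal sketch of such a lookup in C, using the standard library's bsearch() - the record layout is illustrative:

```c
/* A sketch of the AS lookup: records sorted by number, searched with the
 * C library's bsearch(). Illustrative names only. */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct as_record {
    uint32_t number;      /* AS number, the sort key */
    uint32_t name_offset; /* offset of the name in the string pool */
};

static int compare_as(const void *key, const void *member) {
    uint32_t number = *(const uint32_t *)key;
    const struct as_record *as = member;
    return (number > as->number) - (number < as->number);
}

static const struct as_record *find_as(const struct as_record *records,
                                       size_t count, uint32_t number) {
    return bsearch(&number, records, count, sizeof(*records), compare_as);
}
```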

The String Pool

Since Autonomous Systems come with a name, those need to be stored somewhere. We wanted to avoid any duplication to keep the database as small as possible and therefore decided to build a string pool which stores each string only once. That way, we can refer to the place in the pool where a string is stored, and all other objects only hold pointers to those locations - which allows us to make all objects fixed-size and not waste a byte.

Since the whole database simply consists of fixed-size objects for each object type (networks, autonomous systems, etc.), it can be read and parsed very quickly, which again accelerates the search.
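
A small sketch of the string pool idea in C; names and layout are again illustrative:

```c
/* A sketch of the string pool: every string is stored once, NUL-terminated,
 * in one big buffer, and all other objects keep only a fixed-size offset
 * into that buffer. Illustrative, not the real format. */
#include <stddef.h>
#include <stdint.h>

struct string_pool {
    const char *data; /* the concatenated, NUL-terminated strings */
    size_t      size; /* total size of the pool in bytes */
};

/* Resolve an offset (as stored in e.g. struct as_record) to the string. */
static const char *pool_get(const struct string_pool *pool, uint32_t offset) {
    if (offset >= pool->size)
        return NULL; /* corrupt offset */
    return pool->data + offset;
}
```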

Endianness

The database is of course built in a way that it can be read and written on all possible operating systems, and we follow the general convention of encoding it in "network byte order", i.e. big endian.
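
In C, that simply means converting every integer field with the standard byte-order helpers when reading - a sketch:

```c
/* All integers are stored big endian ("network byte order") in the file
 * and converted to the host's byte order when read, using the standard
 * ntohl() helper. The record layout is illustrative. */
#include <arpa/inet.h>
#include <stdint.h>

struct as_record_on_disk {
    uint32_t number;      /* big endian in the file */
    uint32_t name_offset; /* big endian in the file */
};

static void as_record_from_disk(struct as_record_on_disk *as) {
    as->number      = ntohl(as->number);
    as->name_offset = ntohl(as->name_offset);
}
```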

Signatures

We garnish all of this with a built-in signature. Because we are using the content of this database for security-sensitive things - firewall rules - we need to make sure that nobody can send us a fake version of the database with no data in it. That would result in no firewall rules being created and hackers having a free go.

To avoid carrying a signature hash around separately, we built it into the database file. That makes it a lot easier to handle, and as a user of libloc you do not even have to worry about it.
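
For readers curious what such a verification looks like, here is a rough sketch using OpenSSL's EVP interface. It is not libloc's actual verification code; the digest choice and the separation of payload and signature are assumptions for the sketch:

```c
/* A rough sketch of verifying an embedded signature with OpenSSL - NOT
 * libloc's actual code. It assumes the payload (everything except the
 * signature field) and the signature have already been separated, and
 * that SHA-256 is the digest in use. */
#include <stdio.h>
#include <openssl/evp.h>
#include <openssl/pem.h>

static int verify(const unsigned char *payload, size_t payload_len,
                  const unsigned char *sig, size_t sig_len, FILE *pubkey) {
    EVP_PKEY *key = PEM_read_PUBKEY(pubkey, NULL, NULL, NULL);
    if (!key)
        return 0;

    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    int ok = ctx
        && EVP_DigestVerifyInit(ctx, NULL, EVP_sha256(), NULL, key) == 1
        && EVP_DigestVerifyUpdate(ctx, payload, payload_len) == 1
        && EVP_DigestVerifyFinal(ctx, sig, sig_len) == 1;

    EVP_MD_CTX_free(ctx);
    EVP_PKEY_free(key);
    return ok; /* 1 if the database is genuine */
}
```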

This was also one of the main reasons to build this whole project and distinguishes us from our competitors.

Smart Updates

Finally, this database needs to be updated on a regular basis. Although the database is very small, it is still a couple of megabytes, and downloading those a couple of times a month adds up. Also, do not forget that we have to serve the data that every IPFire system downloads, which amounts to a lot of terabytes.

The system therefore checks for the latest database version using DNS: a TXT record simply stores the timestamp of the latest version. If that is newer than the current database, the system will try to download it.

To avoid any tampering with that, we use DNSSEC to validate it, and we have some nice tricks to avoid re-downloading any data from outdated mirrors.

That way, only a couple of bytes are exchanged to check whether an update is needed. We can run this check once an hour, which means that any updates will be rolled out to every user very quickly.
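
A sketch of such a check in C using the system resolver; the record name shown is a placeholder for illustration, not necessarily the one libloc really queries:

```c
/* A sketch of the DNS-based version check using the system resolver
 * (link with -lresolv). The record name is a placeholder. */
#include <arpa/nameser.h>
#include <resolv.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static time_t latest_database_timestamp(void) {
    unsigned char answer[NS_PACKETSZ];
    int len = res_query("_v1._db.location.example.org", ns_c_in, ns_t_txt,
                        answer, sizeof(answer));
    if (len < 0)
        return (time_t)-1;

    ns_msg msg;
    ns_rr rr;
    if (ns_initparse(answer, len, &msg) < 0 ||
        ns_parserr(&msg, ns_s_an, 0, &rr) < 0)
        return (time_t)-1;

    /* TXT rdata: a length byte followed by that many characters */
    const unsigned char *rdata = ns_rr_rdata(rr);
    char txt[256] = { 0 };
    memcpy(txt, rdata + 1, rdata[0]);

    return (time_t)strtoll(txt, NULL, 10);
}
```

Only if the timestamp returned here is newer than the one stored in the local database does a download start; otherwise just these few bytes have been exchanged.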

Standalone

Our library is written from scratch and does not depend on any third-party code, with the exception of the standard C library and OpenSSL for the cryptographic functionality. That makes it easily portable and small enough to be integrated into many other projects, too.

There is more detail on how to use all this on either the command line or in your own application on the official website.

We Are Proud

... to have been working on such an interesting project, and to have given ourselves the opportunity to pay so much attention to every detail. We believe that our work is vital to the internet community and that we are ready to challenge the status quo and provide a rock-solid, easy-to-use, versatile and fast solution.

I could go on about the many features and little details which we spent so much time getting just right.

Please support our work on this and other features all around IPFire with a donation. They wouldn't be possible without your help! Head over to www.ipfire.org/donate to donate now.


This is the official announcement for the release of IPFire 2.25 - Core Update 148 - an update I have personally been waiting for: we are finally rolling out the replacement of Maxmind's GeoIP database with our own, improved implementation.

IPFire Location

As we have already pre-announced some time ago, this side project within the IPFire Project is finally ready for prime time.

It comes with a new implementation to build, organise and access a highly optimised database packed with loads of helpful data for our firewall engines, as well as for our analytics that show where attacks against the firewall originate.

With it, IPFire can block attackers from certain countries, or do the opposite - only permit access to certain servers from certain places. Combining such rules with the rate-limiting feature allows limiting connections from certain locations, which is very helpful against DoS attacks.

No new features have been added, but those that we had have been massively improved. The database is now being updated once a week which makes it more accurate and we no longer require complicated scripts to convert it into different formats to be used in different parts of the operating system.

Instead, the database can be opened and read extremely quickly, which allows access in realtime and makes pages on the web user interface load significantly faster.

We hope that many other projects choose to use our implementation as well, since we have chosen a truly open license for the data as well as the library that works behind it.


In the last couple of months, we, the IPFire development team, have launched a small side project: a new location database for the Internet. In this article, I would like to give you a brief background story on why and how it came to this...

What is this?

I am sure that you all have used a location database before - often called a GeoIP database, after a brand name from a company called Maxmind. Most likely that was in an online shop that showed you shipping costs based on your location, or when you were shown a cookie warning on a website to which the EU's cookie guidelines applied.

Another application is threat prevention, as we use it in IPFire: connection attempts from certain countries can simply be blocked, or port forwardings can be limited to certain countries only.

That is, however, not an exact science. The Internet changes constantly. IP address ranges are re-assigned from one party to another, and it can often take some time until those location databases are all updated. Until then, you will see wrong information, like the Google front page being shown in the wrong language. This might only be a bit of an inconvenience, but for a firewall, we need more recent and reliable data.

Maxmind is the biggest player on the market, and the previous source of GeoIP data in IPFire. They are a Massachusetts-based company and recently changed the terms of their database, which was available under a Creative Commons license before. Now, users are required to register before they are permitted to use the database. Although the company claims their database is still free, it is at least a very grey area from our point of view, and we have since decided to no longer use it. Currently, IPFire ships the last version of the database that was released before registration was required, since we did not accept the new end-user license agreement.

Accuracy Issues

Development of our own successor started long before that, because we had already become more and more unhappy with the accuracy of Maxmind's free data. Potentially it is deliberately made inaccurate to promote their paid services. Unfortunately, neither we nor any of their customers have any insight into where the data is coming from and how the database is composed.

Since this is security-relevant for us, we needed these problems fixed.

Most importantly, the data needed to be accurate. We do not care about geo coordinates, or a county or city, but only the country. It isn't really possible to divide the Internet into countries, but what is possible is to have an idea of the jurisdiction from which someone is accessing a website. For most people, that is enough accuracy.

We also wanted to know which Autonomous System a user is coming from, because that is the only thing the Internet can truly be divided into: it is an inter-connected network of autonomous systems, and that carries valuable information for us - for example, to identify cloud providers.

On top of that, it is often interesting to have other attributes. There are plenty of anonymous proxies out there that users employ to hide. Maxmind uses special country codes (in this case A1) to mark those, but unfortunately that loses the actual country the system is located in.

We added an extra set of attributes that can be used to flag certain networks for various reasons, allowing us to gather more information without trading away accuracy, and we use them to mark satellite providers, anonymous proxies and anycast networks.

Finally, we needed to be sure that the database is recent and not modified by a third party. That is something our competitors do not offer. We have built a cryptographic signature into the database, so that when it is downloaded to your local IPFire system, you can be sure that it is coming from us and has not been tampered with before it is loaded into your firewall.

I will blog more on the technical solutions and challenges in a later post.

Problems solved

So, we are now close to releasing version 1.0 of what we have built: an always up-to-date location database that brings you more, and more accurate, data.

We see it as an independent project within the IPFire Project, because we are not the only ones who can greatly benefit from this piece of software: DNS load-balancers that steer users to their closest data centre, online shops that need to comply with different legal requirements, and many more...

We have implemented it as a C library with a very small footprint and OpenSSL as its only dependency. We then added Python and Perl modules. That way, it can easily be integrated into other software, and of course we expect other people to contribute bindings for other scripting languages, too.

There are download scripts that regularly update the database and use some smart ways to avoid transferring any unnecessary data.

All this is now in its final stages of testing, and you can use it in the latest testing release of IPFire. If you are interested in contributing by reporting bugs, adding language bindings or helping to make the database more accurate, please join our location mailing list.

More information and a live demo can be found on location.ipfire.org.


This is an update I have personally been waiting a long time for: we are finally rolling out the replacement of Maxmind's GeoIP database with our own, improved implementation.

IPFire Location

As we have already pre-announced some time ago, this side project within the IPFire Project is finally ready for prime time.

It comes with a new implementation to build, organise and access a highly optimised database packed with loads of helpful data for our firewall engines, as well as for our analytics that show where attacks against the firewall originate.

With it, IPFire can block attackers from certain countries, or do the opposite - only permit access to certain servers from certain places. Combining such rules with the rate-limiting feature allows limiting connections from certain locations, which is very helpful against DoS attacks.

No new features have been added, but those that we had have been massively improved. The database is now being updated once a week which makes it more accurate and we no longer require complicated scripts to convert it into different formats to be used in different parts of the operating system.

Instead, the database can be opened and read extremely quickly, which allows access in realtime and makes pages on the web user interface load significantly faster.

We hope that many other projects choose to use our implementation as well, since we have chosen a truly open license for the data as well as the library that works behind it.

I will talk more about this in a later blog post and explain to you the advantages of libloc.

Please help us test!

In the meantime, please help us test this important release and report any issues you find to the development team, to make it the best release of IPFire that we have ever had.

You can also support our work with your donation!


Maxmind, a US-based company quite well-known for providing the GeoIP database that powers a lot of services needing GeoIP data, has changed the usage policy of this database with effect from the beginning of this year. Unfortunately, this makes it unusable for IPFire and we have decided to replace it. Here is how we are going to do it.

IPFire uses geo information for two things: we show flags next to DNS servers, firewall hits, etc., and we use it to block connections from or to certain countries in the firewall.

We, the IPFire developers, started a side project to replace the Maxmind GeoIP databases in IPFire over two years ago. We felt that this was necessary because the quality of the database was getting worse and worse. Strict licences, as well as changes like the one this December, are very incompatible with the freedom that we want to provide to all IPFire users.

Introducing libloc

The code name is libloc and it is a library written in C which reads from our own location database.

The code is written in a portable way and runs on multiple operating systems so that it can be used by other projects, too. The library is tiny and the code can quickly be audited. Our focus was on easy usability and performance. Because of smart packing of the data into the database and intelligent search algorithms, we are approximately 10 times faster than Maxmind's code. Pages will load faster and libloc can be used in software where location information needs to be present as quickly as possible - for example in the Intrusion Prevention System or in a DNS server that performs load-balancing based on the geographical location of the user. With provided bindings for Python and Perl, it is easy to use in scripting languages, too.
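
For a rough idea of what using the library looks like, here is a minimal sketch. The header paths, function names and database location below are recalled from libloc's public interface and may not match the current API exactly - please consult the documentation on location.ipfire.org before relying on them:

```c
/* A minimal usage sketch. Header paths, function names and the database
 * path are recalled from libloc's public interface and are assumptions -
 * check the installed headers before relying on them. */
#include <stdio.h>

#include <loc/libloc.h>
#include <loc/database.h>
#include <loc/network.h>

int main(void) {
    struct loc_ctx *ctx = NULL;
    if (loc_new(&ctx))
        return 1;

    FILE *f = fopen("/var/lib/location/database.db", "r");
    struct loc_database *db = NULL;
    if (!f || loc_database_new(ctx, &db, f))
        return 1;

    struct loc_network *network = NULL;
    if (loc_database_lookup_from_string(db, "8.8.8.8", &network) == 0)
        printf("country: %s\n", loc_network_get_country_code(network));

    return 0;
}
```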

To make sure that you are only using genuine data, the database is cryptographically signed and being automatically updated whenever needed.

It is a really awesome project and many hours of engineering work have been put into it. It is software design at its finest and I had a lot of fun working on the project.

The Changes For Now

Sadly, this project is not yet ready for production and so this is a slightly hurried announcement. Of course you can support us with your donation. Keep watching this blog for any further updates. But so far, here are the most important things:

If you install a new IPFire system with a release older than 2.23 - Core Update 140, you won't be able to use geo blocking. The reason is that Maxmind's database is not shipped with IPFire, because it was unclear whether we could do that legally or not. A script regularly updated the database instead, but this service has now been deactivated by Maxmind.

With Core Update 140, we ship the last version of the database that is available under the old Creative Commons licence. Now, Maxmind requires signing a new licence, which we cannot do for various reasons, and therefore we are looking to retire this database altogether and use libloc instead.

Those changes will come with one of the following updates. The code is already done and at a very good beta stage. What is not yet fully finished is the actual database. We are writing and optimising scripts that gather the information we need and compile it. This is what we are working on right now, and hopefully it won't take long.