I’ve recently been building a push system. To make the system scalable, the best practice is to keep each connection as stateless as possible, so that when a bottleneck appears, the capacity of the whole system can be expanded simply by adding more machines. When it comes to load balancing and reverse proxying, Nginx is probably the most famous and widely acknowledged choice. However, its TCP proxying is a rather recent thing: Nginx only introduced TCP load balancing and reverse proxying in v1.9, released in late May this year, and it still lacks many features. HAProxy, on the other hand, as the pioneer of TCP load balancing, is mature and stable. I chose HAProxy to build the system and eventually reached 300k concurrent TCP socket connections. I could have achieved a higher number if it were not for my rather outdated client PC.
Step 1. Tuning the Linux system
300k concurrent connections is no easy job, even for a high-end server. To begin with, we need to tune the Linux kernel configuration to make the most of our server.
File Descriptors
Since sockets are treated as files from the system’s perspective, the default file descriptor limit is rather small for our 300k target. Modify /etc/sysctl.conf
to add the following lines:
fs.file-max = 10000000
fs.nr_open = 10000000
These lines increase the total file descriptor limit to 10 million.
Next, modify /etc/security/limits.conf
to add the following lines:
* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000
root hard nofile 10000000
If you run HAProxy as a non-root user, the first two lines should do the job. However, if you run HAProxy as root, you need to declare the limits for the root user explicitly.
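Not part of the original steps, but here is a quick way to check that the new limits actually took effect after logging back in (a minimal Python sketch; the /proc path and resource names are standard Linux):

```python
# Print the per-process file-descriptor limit and the system-wide cap.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"per-process nofile: soft={soft} hard={hard}")

with open("/proc/sys/fs/file-max") as f:
    print("fs.file-max =", f.read().strip())
```

If the soft limit still shows a small number like 1024, the limits.conf change has not been picked up by your session yet.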
TCP Buffer
Holding such a huge number of connections costs a lot of memory. To reduce memory use, modify /etc/sysctl.conf
to add the following lines.
net.ipv4.tcp_mem = 786432 1697152 1945728
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
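As a rough sanity check on the memory cost (my own back-of-the-envelope arithmetic, not from any measurement): with the default per-socket buffers set to 4096 bytes each way, 300k mostly idle connections need on the order of 2.3 GiB of kernel buffer space. Note that tcp_mem is measured in pages (typically 4 KiB), while tcp_rmem and tcp_wmem are in bytes.

```python
# Back-of-the-envelope TCP buffer memory for 300k mostly idle sockets,
# assuming each socket sits at the 4096-byte default rmem/wmem above.
conns = 300_000
default_rmem = 4096  # bytes (middle value of tcp_rmem)
default_wmem = 4096  # bytes (middle value of tcp_wmem)
total_bytes = conns * (default_rmem + default_wmem)
print(f"~{total_bytes / 2**30:.1f} GiB of kernel buffer space")  # ~2.3 GiB
```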
Step 2. Tuning HAProxy
Having tuned the Linux kernel, we now need to tune HAProxy to better fit our requirements.
Increase Max Connections
In HAProxy, there is a “max connection cap”, both globally and per backend. To raise the global cap, we add a line of configuration in the global section.
maxconn 2000000
Then we add the same line to our backend scope, which makes our backend look like this:
backend pushserver
mode tcp
balance roundrobin
maxconn 2000000
Tuning Timeout
By default, HAProxy detects dead connections and closes inactive ones. However, the default timeouts are too short for a scenario where connections are held open in a long-polling fashion. On my client side (actually an Android device), the heartbeat interval is 4 minutes, so my long-lived socket connections to the push server were always being closed by HAProxy; a more frequent heartbeat would be a heavy burden for both client and server. To raise these limits, add the following lines to your backend. The numbers are in milliseconds by default.
timeout connect 5000
timeout client 50000
timeout server 50000
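For a 4-minute heartbeat specifically, the client and server timeouts have to sit above 240000 ms, or HAProxy will still cut the connection between heartbeats. A sketch with illustrative values (the directive names are standard HAProxy; the exact numbers are my assumption, pick yours comfortably above your heartbeat interval):

timeout connect 5000
timeout client 300000
timeout server 300000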
Configuring Source IP to solve port exhaustion
When you reach around 30k simultaneous connections, you will encounter the problem of “port exhaustion”. It results from the fact that each reverse-proxied connection occupies a source port of a local IP. The default port range available for outgoing connections is roughly 30k~60k; in other words, we only have about 30k ports available per IP. This is not enough. We can increase this range by modifying /etc/sysctl.conf to add the following line.
net.ipv4.ip_local_port_range = 1000 65535
But this does not solve the root problem: we will still run out of ports once the ~64k cap is reached.
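To put numbers on it (my own arithmetic, not from any measurement): one source IP with the widened range above gives roughly 64k usable ports, so pushing 300k connections to a single backend address needs about five source IPs.

```python
# Ports available per source IP with the widened range, and how many
# source IPs ~300k connections to one backend address then require.
low, high = 1000, 65535              # net.ipv4.ip_local_port_range
ports_per_ip = high - low + 1        # 64536 usable ports
target_conns = 300_000
ips_needed = -(-target_conns // ports_per_ip)  # ceiling division
print(ports_per_ip, ips_needed)      # 64536 5
```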
The ultimate solution to this port exhaustion issue is to increase the number of available IPs. First of all, we bind a new IP to a new virtual network interface.
ifconfig eth0:1 192.168.8.1
This command binds an intranet address to a virtual network interface eth0:1 whose underlying hardware interface is eth0 (on modern systems, "ip addr add 192.168.8.1/24 dev eth0 label eth0:1" does the same). It can be executed several times to add any number of virtual interfaces. Just remember that the IPs must be on the same subnet as your real application server. In other words, you cannot have any kind of NAT on the link between HAProxy and the application server; otherwise, this will not work.
Next, we need to configure HAProxy to use these fresh IPs. There is a source keyword that can be used either in the backend scope or as an argument on a server line. In our experiment, the backend-scope variant didn’t seem to work, so we went with the per-server argument. This is how the HAProxy config file looks:
backend mqtt
mode tcp
balance roundrobin
maxconn 2000000
server app1 127.0.0.1:1883 source 192.168.8.1
server app2 127.0.0.1:1883 source 192.168.8.2
server app3 127.0.0.1:1883 source 192.168.8.3
server app4 127.0.0.1:1884 source 192.168.8.4
server app5 127.0.0.1:1884 source 192.168.8.5
server app6 127.0.0.1:1884 source 192.168.8.6
Here is the trick: you need to declare them as multiple entries and give them different server names. If you set the same name for all entries, HAProxy will simply not work. If you look at the HAProxy status report, you will see that even though these entries share the same backend address, HAProxy still treats them as different servers.
That’s all for the configuration! Now your HAProxy should be able to handle over 300k concurrent TCP connections, just as mine.
I'm not sure about the IP source exhaustion solution:
the "net.ipv4.ip_local_port_range = 1000 65535" tweak makes sense.
This will allow ~60,000 conns targeting a single backend server (which has its own IP in a real-world scenario).
The next 60,000 conns can target the next backend server (with a different address than the first backend), and so on.
Adding additional IPs to the local network interface is only required when targeting a single backend.
Yeah, it's just as you said.
Our backend server can handle well over 60,000 connections; that's why we have to do this to maximize the capacity of the backend server.
I keep raising ulimit on OpenWrt, but after running for a while it still ends up in a half-dead state. Despairing.
Hi there,
Thanks for a great tutorial. I studied it twice while trying to fix an issue we are having with our Chrome extension and a PHP Ratchet backend server. The problem is that something in HAProxy, PHP itself (or Debian?) limits the number of concurrent connections.
We had a PHP WebSocket server running on port 8080 with a limit of around 1000 (1024?) concurrent connections, so we implemented HAProxy and it now load-balances traffic from 8080 to 8081, 8082, 8083 and so on (multiple WebSocket server instances on different ports to handle more clients) … unfortunately, after hours of digging around (a few of the things from your tutorial were already implemented) and configuration changes, 2000 (2048?) is the highest number we can reach!
Do you have any idea what might be wrong? Would you have time to have a look at our setup and infrastructure?
Thanks!
Is your PHP side able to handle 2000 connections?
Note that if you're connecting to 127.0.0.1, you don't need to bind to a "public" address, just use 127.X.Y.Z, they're all yours!
Correct!
I don't understand the significance of this comment:
"Note that if you're connecting to 127.0.0.1, you don't need to bind to a "public" address, just use 127.X.Y.Z, they're all yours!"
Can you explain in more detail?
Thanks for share!
hi
tnx for your Article.
I saw you use the loopback IP address (127.0.0.1) on the backend.
Do the haproxy service and your app run on the same server?
This is just a demo config. In this demo, yes.
Hi
I use haproxy-1.5.14-3.el7.x86_64 on CentOS 7.2 with kernel 3.10.0-327.18.2.el7.x86_64.
I set two IPs on the HAProxy server, for example eth0 = 10.10.10.1 and virtual interface eth0:1 = 10.10.10.2, and use one backend server with IP 10.10.10.11.
I use “source” in the HAProxy configuration file to send requests from the two IP addresses (eth0 = 10.10.10.1 and eth0:1 = 10.10.10.2) to the backend side. Please see this config:
backend test
mode tcp
log global
option tcplog
option tcp-check
balance roundrobin
server myapp-A 10.10.10.11:9999 check source 10.10.10.1
server myapp-B 10.10.10.11:9999 check source 10.10.10.2
With this scenario, I get 120k connections on the backend side (10.10.10.11) and everything is OK.
To get more connections, I added another backend server, for example 10.10.10.12. Please see this config:
backend test
mode tcp
log global
option tcplog
option tcp-check
balance roundrobin
server myapp-A 10.10.10.11:9999 check source 10.10.10.1
server myapp-B 10.10.10.11:9999 check source 10.10.10.2
server myapp-C 10.10.10.12:9999 check source 10.10.10.1
server myapp-D 10.10.10.12:9999 check source 10.10.10.2
In this scenario I expected to get 120k on each backend server, but no! Each backend server only gets 60k connections!
What went wrong?
can you help me?
Tnx
Looks like the proxy exhausted its ports. You need more IPs for each proxy.
backend mqtt
mode tcp
balance roundrobin
maxconn 2000000
server app1 127.0.0.1:1883 source 192.168.8.1
server app2 127.0.0.1:1883 source 192.168.8.2
server app3 127.0.0.1:1883 source 192.168.8.3
server app4 127.0.0.1:1884 source 192.168.8.4
server app5 127.0.0.1:1884 source 192.168.8.5
server app6 127.0.0.1:1884 source 192.168.8.6
In the above configuration, does it mean that we will have two MQTT nodes running on ports 1883 and 1884?
Yes. Server should be able to handle requests from both ports.
Setting the hard and soft limits to 10 million like you posted will result in a broken system – this is too much even for our Dell R630's that are running CentOS 6.7 (128GB memory)!
1 million is the maximum that you can set these to – I think you have a typo.
Not quite sure about CentOS. Was using Debian and able to reach the number.
You need to raise more than just the file descriptor ulimit to get above 1 million; I solved that the other day and it is hard to google. Look at sysctl fs.nr_open, which defaults to 1 million, and fs.file-max. Then you will be able to set ulimit above 1 million.
Petr
Hello,
We have two Redis web servers behind HAProxy, but I need all traffic to go to Redis-web1 only, and HAProxy should divert traffic to Redis-web2 only when Redis-web1 is down.
Is this possible ? Please suggest
Thanks
Sushil R
What happens if one uses HAProxy to proxy traffic to remote servers?
Will the virtual network interface still work? I noticed you are using localhost, which means the apps run locally where HAProxy is, but is this still possible when the apps run on another server?
If so, does that mean I will have to create the virtual interfaces on the remote servers? I am guessing that won't be possible, right?
Please let me know if you understand my question.
Thanks!!!
It's definitely doable, just creating the virtual interface will be more complicated. In the meanwhile, your remote server should be configured to accept multiple connections from the same host.
Hi,
What's the significance of having the server listen on two different port numbers in this setup? The server won't have any port exhaustion issues because it is not initiating outbound connections the way HAProxy is.
regards,
Just a demo for load balancing. No actual usage if you only have one server.
Is this a real production-level setup? Can you give detailed specifications, e.g. RAM, processor, CPU?
I kind of forgot. It's just a normal server configuration, like 16 physical cores with 64GB RAM IIRC.
These lines increase the total file descriptors’ number to 1 million.
Next, modify /etc/security/limits.conf to add the following lines:
* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000
root hard nofile 10000000
The above setting is harmful, it will prevent you from logging into your server. Apply this with caution
I am using HAProxy to load-balance my MQTT broker cluster. Each MQTT broker can easily handle up to 100,000 connections. But the problem I am facing with HAProxy is that it only handles up to 30k connections per node. Whenever any node gets near 32k connections, the HAProxy CPU suddenly spikes to 100% and all connections start dropping.
The problem with this is that for every 30k connections, I have to roll out another MQTT broker. How can I increase it to at least 60k connections per MQTT broker node?
Note: I cannot add virtual network interfaces in a DigitalOcean VPC.
My config –
```
bind 0.0.0.0:1883
maxconn 1000000
mode tcp
#sticky session load balancing – new feature
balance source
stick-table type string len 32 size 200k expire 30m
stick on req.payload(0,0),mqtt_field_value(connect,client_identifier)
option clitcpka # For TCP keep-alive
option tcplog
timeout client 600s
timeout server 2h
timeout check 5000
server mqtt1 10.20.236.140:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt2 10.20.236.142:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt3 10.20.236.143:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
```
I have done the net.ipv4.ip_local_port_range = 1000 65535 thing.
Running haproxy 2.4 on ubuntu 20