When it comes to choosing a reliable API gateway (especially for microservice-based applications), we have many options indeed, and if you have examined these options, you have probably come across Kong as well. As an open source enthusiast and a believer in the "keep it simple" philosophy, I've always liked projects such as Kong that address a serious problem in a simple and clean way.
Though there are many good articles about Kong, I wanted to share my opinions and write "yet another tutorial" about installing, configuring and securing Kong for production environments. So this article will be a quick start guide that includes a bit more information for production use.
If you already know what Kong is and what its features are, just jump to the Installation & Configuration section.
What is Kong?
Simply put, it's an API gateway that sits in front of your backend services and forwards client requests to those services. Since Kong is the first point that client requests hit, it can transform and/or route any request or response based on your configuration. So you can manipulate requests and responses by adding, appending or removing headers on the fly, and route them wherever you want.
What's more, as an abstraction layer it provides additional features through its plugins, such as authentication, rate limiting, caching, logging, bot detection, CORS, IP restriction and many more. This means developers don't need to implement these features in each microservice, so they can focus on business logic, which improves productivity.
Kong is a platform-agnostic solution that runs on almost every platform. You can run it as a Docker container or install it on a GNU/Linux distribution, on-premises or in the cloud. It can even run as a Kubernetes ingress controller.
Architecture
Here is a simple diagram showing how Kong works and how it extends your application ecosystem with its features.
How It Works
Kong is built on top of Nginx (this is why it's highly extensible). Through its own Admin API, it configures the underlying Nginx server to provide proxying for backend services.
Kong uses OpenResty, an enhanced distribution of Nginx that includes the lua-nginx-module. Kong is distributed along with OpenResty to leverage its Lua scripting capability for configuring and extending Nginx.
In simpler words, Kong accepts your reverse-proxying directives via its own Admin API over HTTP(S), translates them into Nginx configuration using the Lua scripting language, and builds a gateway platform for microservice backends.
Admin API
When you install Kong on a host, the Admin API is accessible on ports tcp/8001 (HTTP) and tcp/8444 (HTTPS). By interacting with the Admin API, you can configure Kong dynamically.
As a security best practice, the Admin API ports are for internal use only and should be firewalled.
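On CentOS 7 (which ships firewalld), a minimal sketch of this practice might look like the following; the 10.10.10.0/24 management subnet is an assumption for this article's example environment. Note that ports published by Docker with -p can bypass firewalld rules, which is why we'll also bind the Admin API to 127.0.0.1 later in this guide.

## Allow the Admin API ports only from a trusted subnet (adjust to your network).
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.10.10.0/24" port port="8001" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.10.10.0/24" port port="8444" protocol="tcp" accept'
firewall-cmd --reload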
Proxying
On the other hand, Kong listens on ports tcp/8000 (HTTP) and tcp/8443 (HTTPS) to receive end-user requests and, according to the configuration you define, forwards them to your endpoints. These ports are publicly facing and must allow external connections.
Datastore
Kong uses an external database to store configuration data. You pass your configuration directives over the Admin API and Kong stores them in its datastore. Multiple Kong servers can use a central database, which makes the gateway platform highly available and scalable. (I'll cover this in detail in the Clustering section.)
The supported datastores are Cassandra and PostgreSQL. Though Cassandra is the preferred datastore for building a multi-regional Kong environment, PostgreSQL can also be used in production without any issues. (In this post, we'll use PostgreSQL for the sake of simplicity.)
Cache
As you know, when it comes to performance, the database can easily become the main bottleneck. In order to avoid this and improve performance, Kong maintains a cache that holds the configuration data in the node's memory.
No matter which datastore engine you use, when you call the Admin API to configure Kong, the configuration data is written to the external datastore, then pulled from the external database and written to the node's memory as cache; finally, the configuration is applied.
At this point, no database round trip is needed anymore while proxying requests. If you make any configuration change, Kong invalidates the cache and pulls the new configuration to cache it again. This approach obviously boosts Kong's performance.
In order to avoid breaking Kong's cache invalidation mechanism, you should not manipulate the datastore directly.
DB-less Mode and Declarative Configuration
As I mentioned in the Datastore section, Kong requires a database for its configuration data; however, there is another option called DB-less mode. In this mode, the configuration directives are declared in a JSON- or YAML-formatted configuration file; Kong reads this file to get its configuration directives and writes them to the node's memory.
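To make this concrete, here is a minimal sketch of a declarative file and a DB-less startup; the service name and upstream address are hypothetical:

## Create a minimal declarative configuration file.
cat <<'EOF' > kong.yml
_format_version: "1.1"
services:
  - name: example-service
    url: http://10.10.10.50:3000
    routes:
      - name: example-route
        paths:
          - /example
EOF

## Start Kong in DB-less mode, pointing it at the file.
docker run -d --name kong-dbless \
  -e "KONG_DATABASE=off" \
  -e "KONG_DECLARATIVE_CONFIG=/kong.yml" \
  -v "$(pwd)/kong.yml:/kong.yml" \
  -p 8000:8000 -p 8443:8443 \
  kong:latest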
With this mode, you can manage Kong using a CI/CD tool: since the configuration directives are stored in a static file, you can keep it in a git repository and distribute it to multiple Kong nodes. However, in this mode the Admin API is read-only, because the only way to configure Kong is through the declarative configuration file. In other words, you can't push configuration directives to the Admin API over its RESTful interface.
Moreover, some useful plugins are not compatible with this mode, since they require a database by design. For example, the rate-limiting plugin needs a database to hold its limit counters.
This mode may seem useless for production use; however, it is also part of another mode, called hybrid mode, that can be used to create a Kong cluster. I'll explain that mode in the next section as well.
Clustering
The API gateway should probably be one of the most reliable services in your infrastructure, and we don't want to create a SPOF at this layer. Since gateways act as the entry point to your backend services, outages or performance issues can't be tolerated here. Needless to say, if the gateway dies, the application practically dies too. So it should be redundant and scalable.
As I mentioned before, Kong uses a datastore to hold its configuration, and multiple Kong nodes can use this central datastore to form a cluster.
A typical Kong cluster looks like this:
Note that Kong is not a load balancer; you need a load balancer in front of your Kong cluster to distribute your traffic.
So if you need to handle more incoming requests, you can easily scale horizontally by installing Kong on new nodes and pointing them at the central datastore.
This simplicity and flexibility also makes Kong a suitable platform for automation. For example, you can track your Kong cluster's resource utilization (by collecting metrics via its Prometheus plugin) and decide to scale up or down automatically. Kong installation and initial configuration is a quite straightforward process that can easily be automated.
Hybrid Mode Clustering
Classically, Kong nodes depend on the central datastore, but this is not the only option: there is another way, called hybrid mode, to build a Kong cluster.
In this mode, Kong nodes are separated into two roles: control plane (CP) nodes and data plane (DP) nodes.
While CP nodes (which provide the Admin API) are the only components that interact with the central datastore, DP nodes (which handle incoming requests) fetch their configuration from a CP node.
Here's a good illustration of the two Kong cluster modes.
As you may have noticed, this mode uses both DB and DB-less modes, which is why it's called hybrid.
It has benefits such as reducing the load on the datastore, improving security and easing management.
Though this mode is very suitable for distributed environments, in this article I'm going to configure the Kong cluster the classical way for the sake of simplicity.
For further information about hybrid mode, please check its documentation https://docs.konghq.com/2.0.x/hybrid-mode/
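For a rough idea of what this looks like in practice, here is a hedged sketch based on the Kong 2.0 hybrid mode documentation. It assumes a shared cluster certificate pair (generated, for example, with kong hybrid gen_cert) mounted into both containers, and a hypothetical CP address of 10.10.10.20:

## Control plane node: talks to the datastore and serves the Admin API.
docker run -d --name kong-cp \
  -e "KONG_ROLE=control_plane" \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=10.10.10.11" \
  -e "KONG_PG_USER=kong" \
  -e "KONG_PG_PASSWORD=kong" \
  -e "KONG_CLUSTER_CERT=/certs/cluster.crt" \
  -e "KONG_CLUSTER_CERT_KEY=/certs/cluster.key" \
  -v "$(pwd)/certs:/certs" \
  -p 8001:8001 -p 8005:8005 \
  kong:latest

## Data plane node: no database; fetches its configuration from the CP over tcp/8005.
docker run -d --name kong-dp \
  -e "KONG_ROLE=data_plane" \
  -e "KONG_DATABASE=off" \
  -e "KONG_CLUSTER_CONTROL_PLANE=10.10.10.20:8005" \
  -e "KONG_CLUSTER_CERT=/certs/cluster.crt" \
  -e "KONG_CLUSTER_CERT_KEY=/certs/cluster.key" \
  -v "$(pwd)/certs:/certs" \
  -p 8000:8000 -p 8443:8443 \
  kong:latest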
Features
Kong has many useful features (known as plugins). These plugins either come bundled with Kong or are provided by the community. As I said briefly, Kong uses OpenResty's ngx_http_lua_module for its core configuration and plugin management infrastructure, and this makes it possible to write new plugins that extend Kong further. Thanks to this wise approach, there are many plugins on the Kong Plugin Hub!
Here are some useful plugins worth mentioning:
Authentication Mechanisms
Authentication plugins provide a bunch of authentication mechanisms for backend services, and it's possible to use a different mechanism for each service. For example, while service-A uses basic authentication, service-B may only accept requests with a JWT token.
One of the well-known benefits of using an API gateway is that it makes it possible to offload the authentication overhead to the gateway. Thus, developers can focus on the business logic of their backends without worrying about implementing authentication in each backend service. (Of course, you still need to make sure the services themselves are well isolated from unrelated connections.)
Some of the mechanisms supported by Kong are listed below; a short example of enabling one follows the list.
- Basic Auth
- LDAP integration
- OAuth 2 integration
- Key based auth
- HMAC auth
- JWT tokens
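As a taste of how these are enabled, here is a hedged key-auth sketch; the service name (service-a), the consumer, the route path and the key value are all hypothetical:

## Enable key-auth on a service, then issue a key to an existing consumer.
curl -X POST http://localhost:8001/services/service-a/plugins \
  --data "name=key-auth"
curl -X POST http://localhost:8001/consumers/my-user/key-auth \
  --data "key=my-secret-key"

## Clients must now present the key, by default in the apikey header.
curl -s http://10.10.10.21:8000/some-route/ --header "apikey: my-secret-key"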
Security
Kong has some goodies that add security-related capabilities to your endpoints in an effortless way. For example, using the bot detection plugin, you can detect and block requests initiated by specific bots or custom clients. Or, with the IP restriction plugin, you can easily restrict access to your services by either whitelisting or blacklisting IP addresses.
Here are some built-in security plugins that come with the Kong open source edition; a short example follows the list. (There is also a commercial version, Kong Enterprise, which includes more features.)
- ACME (Let’s encrypt and ACMEv2 integration)
- CORS
- IP Restriction
- Bot Detection
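For instance, here is a hedged IP restriction sketch for a hypothetical service; note that on Kong 2.0 the parameters are named whitelist/blacklist (renamed to allow/deny in later releases):

## Allow only a trusted subnet to reach the service.
curl -X POST http://localhost:8001/services/service-a/plugins \
  --data "name=ip-restriction" \
  --data "config.whitelist=10.10.10.0/24"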
Traffic Control
Traffic control plugins are very useful features that almost everyone will need. For instance, the proxy cache plugin caches frequently requested responses in the Kong nodes' memory.
As you know, caching mechanisms are important for reducing the load on backends, especially on the data layer. If you cache frequently requested responses that would otherwise run a bunch of heavy SQL queries on the backends, you can reduce the load on your database and related services.
Other plugins are listed below; a short example follows the list:
- Rate limiting
- ACL (defines who can access the services)
- Request body size limiting
- Request termination (terminates certain requests with a specific status code)
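As a sketch, rate limiting a hypothetical service to 100 requests per minute, counted locally on each node:

## Enable rate limiting on the service.
curl -X POST http://localhost:8001/services/service-a/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=100" \
  --data "config.policy=local"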
Analytics & Monitoring
When it comes to troubleshooting (especially performance-related issues), one of the most important things is metrics, for sure! Particularly in microservice environments, metrics become imperative.
Though microservices offer a lot of benefits (which I won't enumerate, since this is not a post about them), they bring their own challenges too. Unlike old-school monolithic applications, the decoupled nature of microservices complicates visibility. So, in a typical microservice environment (where a bunch of independent services talk to each other), you need to collect good-quality metrics and correlate them in an understandable way in order to know how performant your environment is.
Well, Kong is an API gateway that sits between clients and services (you can also place it between your microservices). Therefore you can grab a lot of meaningful metrics from it, such as latency values for your endpoints, or collect tracing spans for any API call. This is why Kong has the plugins below; a quick example follows the list:
- Prometheus
- Zipkin
- Datadog
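As a sketch, the bundled Prometheus plugin can be enabled globally; the node then exposes metrics on its Admin API for Prometheus to scrape:

## Enable the Prometheus plugin for all services.
curl -X POST http://localhost:8001/plugins --data "name=prometheus"

## The metrics are then exposed on the Admin API.
curl -s http://localhost:8001/metrics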
Logging
Needless to say, another must-have feature for distributed applications is logging.
Logs say a lot! Based on their importance level, they should be collected, parsed, stored and visualized. Nowadays, we use logs to produce metrics that are processed by modern monitoring systems, and then use those metrics to fire alerts on serious events. Logs have become more important resources than ever.
So, these plugins ship logs to remote logging systems in various ways so they can be processed; an example follows the list:
- Syslog
- HTTP Log
- StatsD
- TCP/UDP Log
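For example, here is a hedged http-log sketch that ships a hypothetical service's logs to a remote collector; the collector endpoint is an assumption:

## Send request/response logs for the service to a remote HTTP collector.
curl -X POST http://localhost:8001/services/service-a/plugins \
  --data "name=http-log" \
  --data "config.http_endpoint=http://10.10.10.30:8080/kong-logs"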
Transformation
Another useful subset of plugins is the transformation-related ones. These plugins can be used to manipulate requests and responses on the fly. For example, you can add, remove or replace headers, bodies or query strings of requests and responses. There is also a good plugin called Correlation ID, which injects a unique ID into requests or responses in order to track the HTTP traffic flow. (See the sketch after this list.)
- Request & Response Transformer
- Correlation ID
- gRPC Web
- Inspur Request & Response Transformer
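As a sketch, the request-transformer plugin can inject a header into every request forwarded to a hypothetical service; the header name and value are assumptions:

## Add a custom header to all requests proxied to the service.
curl -X POST http://localhost:8001/services/service-a/plugins \
  --data "name=request-transformer" \
  --data "config.add.headers=x-served-by:kong"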
Installation & Configuration
In this section, three separate CentOS 7 instances will be used to build a Kong cluster. And in order to keep it simple, every component will run on top of Docker.
Here are the components:
- Datastore
As the datastore, we'll use a PostgreSQL Docker container on one of the CentOS instances.
Note that your database should be redundant in production environments. So if you're going to deploy Kong for production use, you need a PostgreSQL replica cluster or a Cassandra cluster in order to avoid a single point of failure.
- Kong Nodes
Two other CentOS instances will be the Kong nodes.
Docker and Kong nodes: it's OK to run Kong nodes as Docker containers in production, but they should also be redundant, so don't run multiple Kong nodes on a single instance.
Finally, my example environment's IP addresses look like this:
kong-datastore: 10.10.10.11
kong-node-01: 10.10.10.21
kong-node-02: 10.10.10.22
Step 1 — Prepare Nodes
If you don't have Docker on your nodes, install its community edition with the following steps:
## Install epel repo and then install jq
yum install epel-release -y && yum install jq -y
## Install docker-ce related packages
yum install yum-utils device-mapper-persistent-data lvm2 -y
## Enable docker-ce repo and install docker engine.
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce -y
systemctl enable docker && systemctl start docker
## Install latest docker-compose
LATEST_VERSION=$(curl -s https://api.github.com/repos/docker/compose/releases/latest | jq -r '.tag_name')
curl -L "https://github.com/docker/compose/releases/download/$LATEST_VERSION/docker-compose-$(uname -s)-$(uname -m)" > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
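Optionally, a quick sanity check that both tools are installed:

## Verify the installations.
docker --version
docker-compose --version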
Step 2 — Deploy PostgreSQL Container on Datastore Node
Connect to your kong-datastore instance and run:
docker run -d --name kong-datastore \
  --restart always \
  -p 5432:5432 \
  -e "POSTGRES_USER=kong" \
  -e "POSTGRES_DB=kong" \
  -e "POSTGRES_PASSWORD=kong" \
  postgres:9.6
Now we have a PostgreSQL 9.6 container running on the datastore node. Thanks to the environment variables we passed, a database named kong has been created, so we can connect to it with the kong/kong credentials.
Important note: again, this PostgreSQL deployment is just for demonstration purposes; you need an external PostgreSQL replica or Cassandra cluster for production use.
Step 3 — Prepare Kong Database
Next, switch to one of your Kong node instances and run the bootstrap command to perform the initial configuration of the database (creating the schema, etc.):
docker run --rm \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=10.10.10.11" \
  -e "KONG_PG_USER=kong" \
  -e "KONG_PG_PASSWORD=kong" \
  kong:latest kong migrations bootstrap
As you noticed, we pointed it at our Postgres host via the KONG_PG_HOST parameter.
Step 4 — Starting Kong Nodes
Finally, start the Kong container on each of your Kong node instances:
#### Tweak the TCP stack for better request handling.
cat <<EOF >>/etc/sysctl.d/95-kong-specific.conf
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_max_syn_backlog = 16384
net.core.somaxconn = 16384
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 60
fs.file-max = 500000
EOF
sysctl -p /etc/sysctl.d/95-kong-specific.conf

#### Run the Kong node container.
docker run -d --name kong-node \
  -e "KONG_DATABASE=postgres" \
  -e "KONG_PG_HOST=10.10.10.11" \
  -e "KONG_PG_USER=kong" \
  -e "KONG_PG_PASSWORD=kong" \
  -e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
  -e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
  -e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
  -e "KONG_ADMIN_LISTEN=0.0.0.0:8001 reuseport backlog=16384, 0.0.0.0:8444 http2 ssl reuseport backlog=16384" \
  -e "KONG_PROXY_LISTEN=0.0.0.0:8000 reuseport backlog=16384, 0.0.0.0:8443 http2 ssl reuseport backlog=16384" \
  -e "KONG_ANONYMOUS_REPORTS=off" \
  -p 8000:8000 \
  -p 8443:8443 \
  -p 127.0.0.1:8001:8001 \
  -p 127.0.0.1:8444:8444 \
  --restart always \
  --sysctl net.core.somaxconn=16384 \
  kong:latest
With this command set, we first tweaked the kernel for better performance and then started the Kong node with some configuration parameters, passed as environment variables.
Here are the variable explanations:
KONG_DATABASE and KONG_PG_*
Define which datastore the Kong node will connect to.
KONG_{PROXY|ADMIN}_{ACCESS|ERROR}_LOG
Following the common approach for containerized applications, these parameters redirect the Admin API and proxy HTTP logs to stdout and stderr by category. (There are also good logging plugins for shipping the logs to remote systems, which you can enable regardless of these directives.)
KONG_{ADMIN|PROXY}_LISTEN
These environment variables configure the addresses and ports on which the proxy and the Admin API listen for HTTP/HTTPS traffic.
According to our configuration, the Admin API will listen on tcp/8001 for HTTP and tcp/8444 for HTTPS traffic, and the proxy server will listen on tcp/8000 for HTTP and tcp/8443 for HTTPS.
Note that the backlog=N parameter sets the maximum length of the queue of pending TCP connections. This number should not be too small, in order to prevent clients from seeing "Connection refused" errors when connecting to a busy Kong node.
And this is related to the net.core.somaxconn kernel parameter, so you need to increase that parameter at the same time to match or exceed the backlog number. This is why we tweaked the kernel parameters and used "--sysctl net.core.somaxconn=16384" in the docker run command above.
KONG_ANONYMOUS_REPORTS
By default, the anonymous reporting feature, which sends anonymous usage data such as error stack traces to help improve Kong, is enabled. I don't think that's a good fit for production environments, so we disabled it.
Step 5 — Check if Kong Nodes are running
Now the Kong node should be running and serving its Admin API via tcp/8001 and tcp/8444 on localhost.
So you can call it from any Kong node's localhost address with curl:
# HTTP
curl -i http://localhost:8001/
# HTTPS
curl -i --insecure https://localhost:8444/
If everything is OK, it should return JSON output showing the current configuration.
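Since we installed jq in Step 1, you can also pull out a single field as a quick check, for example the node's version:

## Print just the Kong version from the root endpoint.
curl -s http://localhost:8001/ | jq -r '.version'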
Securing Kong Admin API
As you noticed, when we ran the Kong node container, we exposed the Admin API ports only on localhost with the "-p 127.0.0.1:8001:8001" and "-p 127.0.0.1:8444:8444" parameters, which means the Admin API isn't accessible from anywhere other than localhost. Since the Admin API is the most important component, providing a RESTful interface for administration and configuration purposes, it should be well protected. So it's a good habit not to expose it to the network by default, as we did.
But at the end of the day, you will need to use the Admin API remotely. In order to expose it securely, adding fine-grained access control measures such as authentication and authorization is a good idea.
In this section, we will leverage Kong itself to proxy its own Admin API. This is also good practice for learning how to use Kong.
Step 1 — Create a service
Services are used to refer to the upstreams to which incoming requests will be passed.
So with the command below, we'll create a service called admin-api that refers to localhost port 8001 as its upstream.
curl -X POST http://localhost:8001/services \
  --data name=admin-api \
  --data host=localhost \
  --data port=8001
Step 2 — Create a route to pass requests to the admin-api service
Routes specify how (and whether) requests are sent to their services after they reach the API gateway.
So we're going to create a route under the admin-api service to tell Kong that if it receives any incoming request for the /admin-api path, it should pass it to the admin-api service.
curl -X POST http://localhost:8001/services/admin-api/routes \
  --data paths\[\]=/admin-api
Now that our admin-api service is ready, we can call it via curl from a remote host.
# Node01
curl -s http://10.10.10.21:8000/admin-api/

# Node02
curl -s http://10.10.10.22:8000/admin-api/
As you can see, this time we accessed the Admin API by sending our request to Kong's proxy port on its LAN address. So our request hit the Kong proxy, which returned the response by fetching it from the admin-api upstream (http://localhost:8001).
Step 3 — Apply Authentication Control to admin-api Service
At this point, we will apply authentication control to our service in order to deny unauthorized access.
Though Kong has a bunch of security-related plugins, as mentioned above, I'll use basic authentication for demonstration purposes. To see the other options, please take a look at https://docs.konghq.com/hub/#security
Step 3.1 — Enable basic-auth Plugin
This plugin adds basic authentication to a service or a route, with username and password protection. You can enable it for each service or route individually. As we want to protect the Admin API, we enable it for the admin-api service:
curl -X POST http://localhost:8001/services/admin-api/plugins \
  --data "name=basic-auth" \
  --data "config.hide_credentials=true"
Notice that we specify our service name (admin-api) in the URL; this enables the basic-auth plugin for the service called admin-api.
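If you want to double-check that the plugin is in place, you can list the plugins configured on the service:

## List the plugins enabled on the admin-api service.
curl -s http://localhost:8001/services/admin-api/plugins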
Step 3.2 — Create a consumer
Consumers are associated with the individuals who use your services; you can think of them as users. Now, we create a consumer called my-user for our admin-api service:
curl -d "username=my-user&custom_id=1" http://localhost:8001/consumers/
Note: custom_id is a field for storing an existing unique ID for the consumer.
Next, we create a credential for our consumer:
curl -X POST http://localhost:8001/consumers/my-user/basic-auth \
  --data "username=my-user" \
  --data "password=my-password"
As you can see, our username is my-user and our password is my-password.
Step 3.3 — Using the credential
OK, we have a consumer and its credentials, so we can call the admin-api service with our username and password pair.
To do this, we need to send an Authorization (or Proxy-Authorization) header with our request. The header value must be the base64-encoded form of username:password.
So we first encode our credentials, and then send the request with the proper header to a Kong node.
# Encode the credentials. Note the -n flag: without it, echo appends a
# trailing newline that would end up inside the encoded credential.
echo -n "my-user:my-password" | base64
bXktdXNlcjpteS1wYXNzd29yZA==

# And make the request:
curl -s -X GET \
  --url http://10.10.10.21:8000/admin-api \
  --header 'Authorization: Basic bXktdXNlcjpteS1wYXNzd29yZA=='
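As a convenience for quick tests, curl can also compute the Authorization header for you via its -u flag:

## curl builds the Basic auth header from the given credentials.
curl -s -u my-user:my-password http://10.10.10.21:8000/admin-api/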
Since we enabled the basic-auth plugin on the admin-api service, any unauthorized access will be denied. For example, if you don't set the auth header on your request, you'll be rejected with the message below:
curl -s -X GET \
  --url http://10.10.10.21:8000/admin-api

{"message":"Unauthorized"}
Well, at this point we have added basic auth for the Admin API. Though it's still accessible via localhost:8001 (or :8444) on the Kong nodes, you now need to provide your credentials to interact with it from a remote location.