Elastic Stack is a log collection and analysis solution that combines three (and a few more) open-source projects: Elasticsearch, Logstash, and Kibana (hence the older name, "ELK stack").
Furthermore, Elastic offers a nice collection of light-weight client-side log shippers, called Beats.
(Since the Beats are Go-based, it's easy to cross-compile filebeat for a MIPS64-based EdgeRouter Lite, for example.)
Elasticsearch is a database that specializes in fast full-text searches. It offers a RESTful HTTP API, and returns JSON. Elasticsearch is distributed, highly scalable, and supports multi-tenancy.
Elasticsearch is a document-based database that stores schema-free JSON-like objects, rather than the structured rows and columns of a database like MySQL. Elasticsearch isn’t a normalized relational database. Internally, Elasticsearch stores documents as sets of keys and values. Keys are strings, and values may be various types like strings, numbers, dates, or lists.
We can add documents to Elasticsearch with a simple HTTP POST, or search the database with a GET.
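For example (the index name test is arbitrary; Elasticsearch creates the index on first write):
--- elk ~ $ curl -X POST 'localhost:9200/test/_doc?pretty' -H 'Content-Type: application/json' -d '{"host": "fw1", "message": "link up"}'
--- elk ~ $ curl -X GET 'localhost:9200/test/_search?q=message:link&pretty'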
Elasticsearch partitions documents into shards. Shards can be distributed and replicated across nodes in an Elasticsearch cluster for performance and redundancy.
Elasticsearch organizes documents into indexes. Elasticsearch indexes differ from what traditional relational databases call “indexes”. An Elasticsearch index, as a collection of documents, is more analogous to what MySQL calls a “database”.
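List the indexes (with document counts and sizes) via the _cat API:
--- elk ~ $ curl -X GET 'localhost:9200/_cat/indices?v'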
What are the down-sides of Elasticsearch? It uses a ton of RAM. No, a TON. It’s also very complex, though it takes reasonable steps to hide that complexity from new/casual users.
Elasticsearch can be thought of as an effort to make the complex power of the Apache Lucene Java library accessible as a relatively simplified, RESTful API supported by multiple languages.
https://www.elastic.co/guide/en/elasticsearch/guide/current/intro.html
Logstash is the fulcrum of the ELK stack. It’s basically a filter in the Unix sense: it can receive input from a variety of sources (the various beats programs, syslog, SNMP, HTTP, RSS, CloudWatch, the output of shell commands, or even IRC), and it can output to a variety of destinations.
Read this:
https://www.elastic.co/blog/a-practical-introduction-to-logstash
https://discuss.elastic.co/t/high-cpu-usage-with-clean-logstash-install/106598 (Logstash CPU use can run wild without any config!)
A configuration can be split into several files (e.g., in /etc/logstash/conf.d/).
Because Logstash does input→(queue)→filter→output, our configuration minimally needs one input and one output (the filter section may be empty).
Logstash can also have multiple inputs, filters, and outputs.
Logstash concatenates multiple config files in lexicographical order of file names.
So, if the order in the config matters, name them like 10-input-foo.conf and 20-input-bar.conf or whatever.
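A conf.d directory following that convention might look like this (hypothetical names):
--- elk ~ $ ls /etc/logstash/conf.d/
10-input-beats.conf  20-filter-syslog.conf  90-output-elastic.conf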
https://www.elastic.co/guide/en/logstash/current/input-plugins.html
https://www.elastic.co/guide/en/logstash/current/filter-plugins.html
https://www.elastic.co/guide/en/logstash/current/output-plugins.html
https://www.elastic.co/guide/en/logstash/current/config-examples.html
Play with a toy configuration:
--- elk ~ $ sudo systemctl stop logstash
--- elk ~ $ cat ./logstash-stdout.conf
input {
  file {
    path => ["/home/logstash/testdata.log"]
    sincedb_path => "/dev/null"
    start_position => "beginning"
  }
}
filter {
}
output {
  stdout {
    codec => rubydebug
  }
}
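Append a test line and run Logstash in the foreground; the rubydebug codec prints each event as a hash of fields (message, @timestamp, host, path):
--- elk ~ $ echo 'hello, world' | sudo tee -a /home/logstash/testdata.log
--- elk ~ $ sudo -u logstash /usr/share/logstash/bin/logstash -f ./logstash-stdout.conf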
A more realistic configuration might look like this:
--- elk ~ # vim /etc/logstash/conf.d/input-beats.conf
--- elk ~ $ cat /etc/logstash/conf.d/input-beats.conf
input {
  beats {
    port => 5044
  }
}
--- elk ~ $ sudo vim /etc/logstash/conf.d/input-syslog.conf
input {
  syslog {
    port => 5514
    type => "syslog_server"
  }
}
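Test the syslog input from the shell with util-linux logger (here, UDP to the local listener):
--- elk ~ $ logger --server 127.0.0.1 --port 5514 --udp 'test: hello syslog'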
--- elk ~ # vim /etc/logstash/conf.d/filter-syslog.conf
--- elk ~ $ cat /etc/logstash/conf.d/filter-syslog.conf
filter {
  if [fileset][module] == "system" {
    if [fileset][name] == "auth" {
      grok {
        match => { "message" => ["%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} sshd(?:\[%{POSINT:[system][auth][pid]}\])?: %{DATA:[system][auth][ssh][event]} %{DATA:[system][auth][ssh][method]} for (invalid user )?%{DATA:[system][auth][user]} from %{IPORHOST:[system][auth][ssh][ip]} port %{NUMBER:[system][auth][ssh][port]} ssh2(: %{GREEDYDATA:[system][auth][ssh][signature]})?",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} sshd(?:\[%{POSINT:[system][auth][pid]}\])?: %{DATA:[system][auth][ssh][event]} user %{DATA:[system][auth][user]} from %{IPORHOST:[system][auth][ssh][ip]}",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} sshd(?:\[%{POSINT:[system][auth][pid]}\])?: Did not receive identification string from %{IPORHOST:[system][auth][ssh][dropped_ip]}",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} sudo(?:\[%{POSINT:[system][auth][pid]}\])?: \s*%{DATA:[system][auth][user]} :( %{DATA:[system][auth][sudo][error]} ;)? TTY=%{DATA:[system][auth][sudo][tty]} ; PWD=%{DATA:[system][auth][sudo][pwd]} ; USER=%{DATA:[system][auth][sudo][user]} ; COMMAND=%{GREEDYDATA:[system][auth][sudo][command]}",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} groupadd(?:\[%{POSINT:[system][auth][pid]}\])?: new group: name=%{DATA:[system][auth][groupadd][name]}, GID=%{NUMBER:[system][auth][groupadd][gid]}",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} useradd(?:\[%{POSINT:[system][auth][pid]}\])?: new user: name=%{DATA:[system][auth][user][add][name]}, UID=%{NUMBER:[system][auth][user][add][uid]}, GID=%{NUMBER:[system][auth][user][add][gid]}, home=%{DATA:[system][auth][user][add][home]}, shell=%{DATA:[system][auth][user][add][shell]}$",
          "%{SYSLOGTIMESTAMP:[system][auth][timestamp]} %{SYSLOGHOST:[system][auth][hostname]} %{DATA:[system][auth][program]}(?:\[%{POSINT:[system][auth][pid]}\])?: %{GREEDYMULTILINE:[system][auth][message]}"] }
        pattern_definitions => {
          "GREEDYMULTILINE" => "(.|\n)*"
        }
        remove_field => "message"
      }
      date {
        match => [ "[system][auth][timestamp]", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      }
      geoip {
        source => "[system][auth][ssh][ip]"
        target => "[system][auth][ssh][geoip]"
      }
    }
    else if [fileset][name] == "syslog" {
      grok {
        match => { "message" => ["%{SYSLOGTIMESTAMP:[system][syslog][timestamp]} %{SYSLOGHOST:[system][syslog][hostname]} %{DATA:[system][syslog][program]}(?:\[%{POSINT:[system][syslog][pid]}\])?: %{GREEDYMULTILINE:[system][syslog][message]}"] }
        pattern_definitions => { "GREEDYMULTILINE" => "(.|\n)*" }
        remove_field => "message"
      }
      date {
        match => [ "[system][syslog][timestamp]", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      }
    }
  }
}
--- elk ~ # vim /etc/logstash/conf.d/output-elastic.conf
--- elk ~ $ cat /etc/logstash/conf.d/output-elastic.conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
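Once events flow, daily per-beat indexes should appear:
--- elk ~ $ curl -X GET 'localhost:9200/_cat/indices/filebeat-*?v'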
Test the config:
--- elk ~ $ sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t
Logstash plugins are Ruby gems that add functionality for input, output, and filtering.
--- elk ~ $ /usr/share/logstash/bin/logstash-plugin list
--- elk ~ $ sudo /usr/share/logstash/bin/logstash-plugin install logstash-input-syslog
https://www.elastic.co/guide/en/logstash/current/working-with-plugins.html
Because Logstash runs on the JVM, with its minimum heap set equal to its maximum, it always claims the full amount of memory allocated to it.
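The heap is set in /etc/logstash/jvm.options (the Debian package defaults to 1 GB):
--- elk ~ $ grep -e '^-Xm' /etc/logstash/jvm.options
-Xms1g
-Xmx1g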
Kibana visualizes Elasticsearch data (its web UI listens on port 5601).
Beats are client-side programs that ship logs to Logstash. Elastic ships a number of beats, and the community has developed more. Written in Go, the beats have no runtime dependencies, making them easy to deploy.
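A minimal filebeat.yml sketch for a client, assuming filebeat 6.x with its system module enabled (the system module supplies the [fileset] fields the filter above keys on); elk.example.com is this VM:
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
output.logstash:
  hosts: ["elk.example.com:5044"]
--- client ~ $ sudo filebeat modules enable system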
Spin up a Debian VM.
$ virt-install --connect qemu:///system \
--name=elk \
--ram=4096 \
--vcpus=1 \
--cdrom=/data/d/debian-9.7.0-amd64-netinst.iso \
--os-variant=debian9 \
--disk path=/data/d/elk.qcow2,size=40 \
--network bridge=br0,mac=RANDOM \
--graphics vnc
Connect to the VM:
root@elk:/home/paulgorman# apt-get install sudo tmux git vim-nox curl nginx-light apt-transport-https openjdk-8-jdk-headless
root@elk:/home/paulgorman# usermod -a -G sudo paulgorman
root@elk:/home/paulgorman# wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
root@elk:/home/paulgorman# echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
root@elk:/home/paulgorman# apt-get update && apt-get install elasticsearch
root@elk:/home/paulgorman# systemctl daemon-reload
root@elk:/home/paulgorman# systemctl enable elasticsearch.service
root@elk:/home/paulgorman# systemctl start elasticsearch.service
root@elk:/home/paulgorman# curl -X GET "localhost:9200/"
root@elk:/home/paulgorman# apt-get install kibana
root@elk:/home/paulgorman# systemctl daemon-reload
root@elk:/home/paulgorman# systemctl enable kibana.service
root@elk:/home/paulgorman# systemctl start kibana.service
root@elk:/home/paulgorman# apt-get install logstash
root@elk:/home/paulgorman# systemctl enable logstash.service
root@elk:/home/paulgorman# systemctl start logstash.service
Nginx reverse-proxies Kibana simply to restrict access with basic auth. Do TLS termination with Nginx too.
--- elk ~ $ sudo cp STAR_example_com.key /etc/ssl/private/
--- elk ~ $ sudo cp STAR_example_com.crt /etc/ssl/certs/
--- elk ~ $ echo "kibanaadmin:`openssl passwd -apr1`" | sudo tee -a /etc/nginx/htpasswd
--- elk ~ $ sudo vim /etc/nginx/sites-enabled/elk.example.com
server {
  listen 80;
  server_name elk.example.com;
  auth_basic "Restricted Access";
  auth_basic_user_file /etc/nginx/htpasswd;
  location / {
    proxy_pass http://localhost:5601;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}
server {
  listen 443 ssl;
  ssl_certificate /etc/ssl/certs/STAR_example_com.crt;
  ssl_certificate_key /etc/ssl/private/STAR_example_com.key;
  server_name elk.example.com;
  auth_basic "Restricted Access";
  auth_basic_user_file /etc/nginx/htpasswd;
  location / {
    proxy_pass http://localhost:5601;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}
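Check the config syntax before restarting Nginx:
--- elk ~ $ sudo nginx -t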
--- elk ~ $ sudo systemctl restart nginx.service
--- elk ~ $ sudo apt-get install nftables
--- elk ~ $ sudo vim /etc/nftables.conf
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    iif lo accept
    ct state established,related accept
    tcp dport { 22, 443, 5044, 5514 } ct state new accept
    udp dport 5514 accept
    ip protocol icmp icmp type { destination-unreachable, time-exceeded, parameter-problem } accept
    ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit, nd-router-advert, nd-neighbor-advert } accept
  }
  chain forward {
    type filter hook forward priority 0; policy drop;
  }
  chain output {
    type filter hook output priority 0; policy accept;
  }
}
--- elk ~ $ sudo systemctl enable nftables
--- elk ~ $ sudo systemctl start nftables
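Verify the loaded ruleset:
--- elk ~ $ sudo nft list ruleset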