Showing posts with label System Administration. Show all posts

Thursday, 7 December 2023

Multi-Factor Authentication (MFA) over ssh

Free OTP MFA
Securing your internet-facing systems and data is of utmost importance. One critical aspect is ensuring secure access to your servers and protecting them from unauthorised access. Multi-Factor Authentication (MFA) adds an additional layer of security by requiring users to provide two or more pieces of evidence to authenticate themselves. In this tutorial, we will explore how to set up MFA for SSH using FreeOTP, an open-source OTP (One-Time Password) authenticator app.

Prerequisites

Before we begin, make sure you have the following:

  1. A Linux server (Ubuntu, CentOS, or any other distribution)
  2. Administrative access to the server
  3. A smartphone (iOS or Android) to install the FreeOTP app

Step 1: Installing FreeOTP

  1. On your smartphone, open the respective app store (Google Play Store or Apple App Store).
  2. Search for "FreeOTP" and install the app.
  3. Once installed, open the FreeOTP app.

Step 2: Configuring SSH for MFA

  1. Connect to your server using SSH with administrative privileges.
  2. Open the SSH configuration file using a text editor (e.g., nano or vi).
    •  sudo vi /etc/ssh/sshd_config

  3. Look for the ChallengeResponseAuthentication line and set it to "yes" if not already enabled.
  4. Add the following line so that a client must present either a public key plus a password, or a public key plus a keyboard-interactive (PAM) challenge. The keyboard-interactive method is what delivers the Google Authenticator-compatible TOTP (Time-based One-Time Password) prompt:
    AuthenticationMethods publickey,password publickey,keyboard-interactive
  5. Save and exit the SSH configuration file.

Step 3: Configuring the User for MFA

  1. In the SSH configuration file, find the Match User or AllowUsers section for the user you want to enable MFA for.
  2. Add the following line below the user entry:
    AuthenticationMethods publickey,password publickey,keyboard-interactive
  3. Save and exit the SSH configuration file.
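Putting steps 2 and 3 together, the relevant part of /etc/ssh/sshd_config ends up looking something like this (a sketch only; the username in the Match block is illustrative, and UsePAM is usually already enabled by default):

```
# /etc/ssh/sshd_config (excerpt)
ChallengeResponseAuthentication yes
UsePAM yes

# Allow either pubkey+password or pubkey+TOTP (keyboard-interactive via PAM)
AuthenticationMethods publickey,password publickey,keyboard-interactive

# Require the TOTP prompt for a specific user
Match User alice
    AuthenticationMethods publickey,keyboard-interactive
```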

Step 4: Restarting the SSH Service

  1. Restart the SSH service to apply the changes.
    • sudo systemctl restart sshd
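A config mistake here can lock you out of a remote server, so it is worth validating the file first; OpenSSH's `-t` test mode checks the syntax without restarting anything:

```
sudo sshd -t && sudo systemctl restart sshd
```

It also pays to keep an existing SSH session open until you have confirmed you can still log in.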

Step 5: Enabling MFA for the User

  1. On your server, generate a secret key for the user using the following command:
    google-authenticator
  2. You will be presented with a series of prompts. Answer "y" for each of them to configure MFA.
  3. Scan the displayed QR code using the FreeOTP app on your smartphone.
  4. FreeOTP will add your server as a new account and start generating one-time passwords.
  5. Complete the setup process by following the on-screen instructions.
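One gap in the steps above: the `google-authenticator` binary and its PAM module are not part of a base install, and sshd's PAM stack must be told to use the module. A sketch, assuming the google-authenticator package (from EPEL on CentOS/RHEL, or libpam-google-authenticator on Debian/Ubuntu) supplies `pam_google_authenticator.so`:

```
# Install the tooling (package name varies by distribution)
sudo yum install -y google-authenticator        # CentOS/RHEL (EPEL)
# sudo apt install libpam-google-authenticator  # Debian/Ubuntu

# Require the TOTP module in sshd's PAM stack
echo 'auth required pam_google_authenticator.so' | sudo tee -a /etc/pam.d/sshd
```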

Step 6: Testing the MFA Setup

  1. Attempt to SSH into your server using the user account that has MFA enabled.
  2. After entering the username and password, you will be prompted for the verification code.
  3. Open the FreeOTP app on your smartphone and find the account associated with your server.
  4. Enter the current one-time password generated by FreeOTP.
  5. If the authentication is successful, you will gain access to your server.

By implementing MFA for SSH using FreeOTP, you have taken a significant step towards bolstering the security of your server. MFA provides an additional layer of protection against unauthorised access and greatly reduces the risk of compromised user credentials. Remember to enforce strong passwords and regularly update your system to maintain robust security.

In future tutorials (when I get the time) I will show how to further harden your system with fail2ban and geoblocking.

Tuesday, 20 April 2021

How SORBS ruined my life (old blog)

When this blog entry was written, Open Relay blockers like SORBS were considered the primary defence mechanism against spam email. They are still often used, but not as much as they used to be - for many of the reasons discussed in this article. SPF, DKIM & DMARC have largely taken their place.
 
SORBS (SPAM and Open Relay Blocking System)
 
I've had a helluva week so far, and I've got SORBS to thank for it.

SORBS (Spam and Open Relay Blocking System) is an email server reputation service. It (along with other reputation services) collects data on servers that are either poorly configured (and open to abuse) or actively send spam. Reputation services are an excellent way of protecting against spam. They have very high hit rates and (generally) very low false positives. They also have a very low performance overhead compared with heuristic and Bayesian filtering techniques. The mail servers I administer are configured to quarantine all email that is listed by any of SORBS, SpamHaus or SpamCop (two other reputation services).

However, sometime last week (when I was off work sick), SORBS listed pretty much all of the hotmail, windows live, yahoo, bigpond, optusnet and myplace servers. There may have been others listed as well.

Responding to a few complaints that came in on Monday about email not being received, I began checking. Then began the sinking feeling that goes along with knowing that I'd have to change the email filtering rules, do some regression tests and then resubmit a week's work of spam through the email filters. SORBS is a very aggressive filter and I've been quite reliant on it for some time.

Less fun was trying to explain the problem to the CEO.

The first part of the quick fix was to move the SORBS check to the end of the anti-SPAM rules of the DMZ mail filter and set it to monitor only - not block. The DMZ mail server is exposed to the Internet and performs basic/quick checks only and does not look at the content of the email messages. It is highly robust and is meant as the front line of defence. It experiences an average of 90 security attacks per hour. It filters about 75% of the email traffic as inappropriate before passing the "possibly okay" messages to the second filter.

The second part was to add a check for zero day threats on the DMZ server and tag messages accordingly.

The third part was to tighten the screws a bit on the second mail filter. This meant decreasing the tolerance limits a little and adding a few more checks, including looking for the tags placed on the email messages by the DMZ mail filter. Usually, this server intercepts about 8% of the messages passed to it. Messages quarantined by this server may result in a quarantine notification and even a self-release option, depending upon the spam score the email receives.

Now that these changes were made, I resubmitted 2143 quarantined spam messages. With the new rules in place, 1963 messages were blocked and 180 passed through to the second filter, which blocked a further 68 messages, leaving 112 messages sailing through to the mail server. I don't know what the spam:ham ratio of the 112 messages was, but the count seemed to be manageable, so I released the remaining 14,000 quarantined messages and sent an email instructing people to forward any received spam to the spam submit mailbox for heuristic analysis. So far, I have 35 spam messages submitted by users, which (if accurate) gives us a spam hit rate of 99.75% with (hopefully) a close to zero false positive rate. The hit rate has dropped to 59.4% for the DMZ server and risen to 12.5% for the second filter. Time and monitoring will determine how successful the changes have been.

Moving forward, I have written replacement rules that will no longer quarantine email based upon failed reputation alone. The new rules will look at SPF (Sender Policy Framework - RFC 4408) in addition to reputation. If the SPF check passes, the email will be accepted. If SPF fails, it will be dropped. If SPF is 'softfail' or 'none' then it will be subject to a reputation check. Any reputation failure will drop the connection with a 4xx temporary-failure response, with details as to why delivery failed. The sending MTA will then notify the sender that the email could not be delivered, and the sending server's operator has the option of implementing SPF. I'd like at some point to add a DKIM check, but that's a reasonably difficult task.
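The decision flow described above can be sketched as a small function (the function name, arguments and the 451 response code are illustrative, not a real filter API):

```shell
# Sketch of the replacement acceptance rules.
# decide_mail SPF_RESULT REPUTATION -> accept | drop | "tempfail 451"
decide_mail() {
  case "$1" in
    pass) echo accept ;;            # SPF pass: accept regardless of reputation
    fail) echo drop ;;              # SPF hard fail: drop the message
    softfail|none)                  # inconclusive SPF: fall back to reputation
      if [ "$2" = "listed" ]; then
        echo "tempfail 451"         # listed by SORBS/SpamHaus/SpamCop
      else
        echo accept
      fi ;;
  esac
}

decide_mail pass listed     # → accept (SPF pass overrides reputation)
decide_mail none listed     # → tempfail 451
```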

BTW, I am heavily influenced by Meng Weng Wong's whitepaper on Messaging Anti-Abuse.


Tuesday, 11 December 2018

Tutorial: CentOS 7 Installation guide for vmware ESX environments (Part 1)

Although this guide is written specifically for ESX environments, it translates easily to other virtual environments. The same general rules apply even if the steps are slightly different. There is an exception for MS Hyper-V, where it's easier just to leave the FDD emulation in place rather than removing it.

If you're installing RHEL, Fedora, OL or Scientific Linux rather than CentOS, most of the installation will apply, although OL has very different recommendations. The instructions here detail a minimal install (Micro-Instance) which will give you a very basic server. Instructions for different footprints will follow in subsequent blog entries. These instructions are based on the 7.4 release of CentOS.

Best practice is to install CentOS from either the NetInstall ISO or the Minimal ISO. Currently, the NetInstall (rev 1708) is 422MB. The Minimal is 792MB. The minimal install contains everything needed to set up a basic server. Additional services must be installed from repositories. The NetInstall requires an active Internet connection (and hence a working NIC) in order to complete even a basic installation. By contrast, the standard ISO is 4.2GiB in size.

It must be assumed that everything is out-of-date as of the initial release. So installing from the standard ISO achieves little as most packages will need to be updated from the repositories. The procedure shown will be using the NetInstall ISO.

First, copy the ISO to the VMware ESX server's ISO folder in the datastore, or some other location that is easy for you to install from.

Minimal Server Installation (Micro-Instance)


1) Create new server in ESX using ‘typical’ configuration and give it a name.
2) Select the Datastore for the vmdk files and press next.
3) Select Operating System (Linux, CentOS 64bit)

4) Select appropriate NICs. You will need at least one that has access to the Internet. You can add others now (preferably) or later. Make sure you use the VMXNET3 adapter with connect at power on. Use of the e1000 adapter has been shown to cause problems, particularly during ESX upgrades.

5) Setup virtual disks on the datastore. For a micro-instance, we will need to have it thick provisioned. For larger servers, we can create virtual disks for other partitions – these can be thin provisioned, but any partition containing a boot or swap partition needs to be thick. For larger servers, you will likely have the data-containing partitions on a separate SAN connected via multipath.

The disk allocation for a micro-instance should be between 12GB and 32GB. Larger than that and you should allocate a second (or more) disk. If the specification calls for a specific disk size, definitely create another disk, either thick or thin provisioned. For large footprints, eager-zeroed is preferred, but lazy-zeroed is fine for a micro-instance.


6) Edit the VM settings (tick box) and click continue. Modify the settings as follows:
  1. Memory: 2048MB is the default. For a micro-instance, this can be reduced to as low as 1024MB – particularly if you don’t plan on a GUI (which you shouldn’t anyway).
  2. CPUs: 2 virtual sockets with 1 core per socket for micro-instance or medium. Increase only as required by application. By choosing 2 cores, SMP will be installed in the kernel. You can reduce it to one later, but make sure there are at least two at installation time.
  3. Video Card: Leave at 4MB with 1 display unless you need a GUI. Then increase to 12MB.
  4. CD/DVD: Set to the CentOS NetInstall or MinimalInstall ISO you added earlier. Check ‘Connect at power on’.
  5. Floppy: Remove


7) Click ‘Finish’

8) Select newly created server in the client, connect to the console and power on.



There should be no need to test the media, so just select 'Install CentOS'. The CentOS installer will then boot:

CentOS Installation (GUI)


1) Language: Select English and your location in the world. In my case, this is Australia.

2) Installation Summary: You will need to modify ‘Date/Time’, ‘Security Policy’, ‘Installation Source’, ‘Installation Destination’ and ‘Network and Host Name’. Since this is a Net Install, we need to modify the Network and Hostname first.


3) Network and Hostname. Enable the network adapter. This will perform an automatic DHCP. Unless you plan on using DHCP for your servers, you should change this to a static address. Change the hostname to the FQDN of the server.


Use the configure button to change the name of the adapter to something easier to remember. You can also change the MAC address here. This is useful if you are creating an immutable server. Under the ‘General’ tab, select “Automatically connect to this network when it is available”. Set the IPv4 address and DNS settings. Add IPv6 if utilised. Click ‘Save’ and then ‘Done’ when finished.
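If you prefer to do this from the command line after installation, the same settings live in an ifcfg file; a sketch with illustrative values (the device name and addresses below are assumptions, not from the installer):

```
# /etc/sysconfig/network-scripts/ifcfg-eth0 (illustrative values)
TYPE=Ethernet
NAME=eth0
DEVICE=eth0
ONBOOT=yes          # "Automatically connect to this network"
BOOTPROTO=none      # static addressing
IPADDR=192.168.1.10
PREFIX=24
GATEWAY=192.168.1.1
DNS1=192.168.1.1
```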



4) Date/Time: Change the timezone if required. Check that NTP is enabled and working.


5) Installation Source: Enter the CentOS mirror address and uncheck the mirror list checkbox. Click 'Done' and CentOS will begin updating its repository list.

6) Software Selection: I know it is tempting to select what you want here, but stick with the minimal install for now. The only time you should really select anything is if you are installing a compute node or virtualisation host.

7) Installation Destination: For a micro-instance, you can settle for the defaults. This will create two physical partitions. The first (sda1) will be a 1GiB boot partition formatted as xfs (the default now under CentOS 7). You can change this to ext4 if you like. The second physical partition will use LVM (Logical Volume Management), with two LVM partitions made up of a 1.5GiB swap partition and the rest allocated to root. For simple servers, this will be adequate. Otherwise, modify it here according to the size/space/capability recommendations. Everything here is completely customisable.
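On a 16GB disk, for example, the default layout described above comes out roughly like this (sizes illustrative, lsblk-style):

```
sda               16G
├─sda1             1G   xfs   /boot
└─sda2            15G   LVM physical volume
  ├─centos-swap  1.5G   swap
  └─centos-root 13.5G   xfs   /
```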


8) Once you click 'Done', you will be given a list of changes the installer is about to make based upon your selections. Check them carefully before accepting.


9) Lastly, we will need to select a security policy. This is an area often overlooked; most installations will choose the default XCCDF profile, which contains no rules, although rule lists can be downloaded and applied here. Unless mandated otherwise, use the Standard System Security Profile. The security profiles are administered by Red Hat and not even vetted by CentOS, so you will need to consult the RHEL documentation for details. In brief, the standard profile contains thirteen rules to ensure a basic level of security compliance.

There is a HUGE caveat with the use of security policies: They are something of a black art and emphasise security above all else. This can result in unpredictable changes to your system without notification.

If you want to harden the CentOS setup, I’ll deal with that later. For now, just select the Standard Profile.

Note: There is a bug with the 7.3.1611 ISO with all four STIG security policies that has been fixed with 7.4.1708. Security profiles "Standard System Security Profile" and "C2S for CentOS Linux 7" can't be used in the CentOS 7.5.1804 installer. A bug causes the installer to require a separate partition for /dev/shm, which is not possible.

We could spend hours here. We won’t. That will come with server hardening - when we get to it.


10) Next, click 'Begin Installation' and installation will start. The next page will work concurrently with the installation.


11) Create a root user and a normal (administrative) user. Ideally, you will only ever use the root login once. After that, we will disable root logins completely except from the console. To gain root privileges, you will need to elevate using sudo by placing administrative users in the 'wheel' group (more on that later).

Set the root password to something long and difficult but easy to remember. Make it at least 15 characters by using a phrase with upper/lowercase and numbers, e.g. 1veGot4LovelyBunch0fCoconuts. Store this password in a secure place (e.g. keysoft, secured by two-factor authentication). You will only need this password once and in case of emergencies.

The second user should be your own account or a general-purpose admin user. Make sure the password used is strong.


Once installed, select ‘Reboot’. You should disconnect the ISO at some point. Once the server has booted, you will be greeted with the CLI login screen:


At this point, the installation is complete! Login with your admin user and proceed to configuration.

Tomorrow's blog entry will deal with configuration.

Friday, 7 December 2018

On system security and cargo cult administration

I wrote yesterday's blog piece on Cargo Cult system administration as an intro to this post. It's essentially a warning: ignore the complexities of your systems at your peril. So if you haven't read it yet, read it first before reading this entry.

Two days ago I was alerted to the fact that one of the FTP servers I had setup for a client had stopped working. This was a worry for a couple of reasons: Firstly, the server is quite simple, there's nothing that should go wrong with it; Secondly, this is the second time we've had a problem with this server this year (a unique issue which deserves its own blog entry). Two problems with a critical but simple server does not inspire confidence in a client.

As I said, the server was very basic: a stripped-down CentOS 7.5 server running only sshd, proftpd and webmin. It's also well hardened, exceeding best practice.

I quickly checked all the basics and found the server was running normally. No recent server restarts and no updates to anything critical and certainly nothing matching the timeline given by the client when the server "stopped working". Nothing in any log files to indicate anything other than ftp login failures that started exactly when the client indicated. No shell logins to indicate intrusion or tampering. Config files have not been changed from the backup config files I always create. The test login I was given failed even after changing the password.

At the suggestion of a colleague, I changed the shell account from /sbin/nologin to /bin/bash. I indicated this wouldn't be the issue as this is something I routinely do for accounts that don't need ssh access and this was the way the server had been setup from the beginning.

However, changing the default shell did the trick!

???

I changed the default shell for all the ftp accounts and we were working again. But why? How?

The problem was now fixed and I could have easily left it at that, cargo cult sysadmin style. It would have been easy to say "bug in proftpd" or something like that.

The purpose of the /sbin/nologin shell is to allow accounts to authenticate to the server without shell access. When you establish an ssh session, your designated shell is run. These days it is usually bash, but it can be anything: csh, ksh, zsh, tcsh etc. There are oodles of shells. The nologin shell is a simple executable that displays a message saying the account is not permitted shell access and exits. Simple, yet effective. The list of acceptable shells is located in /etc/shells, which most sysadmins don't bother editing; they leave it at the defaults unless an unusual shell is used. It includes bash and nologin.
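The check pam_shells performs can be sketched in a few lines of shell: the account's shell must appear, verbatim, in /etc/shells. A temp file stands in for /etc/shells here so the example is self-contained; the function name is illustrative.

```shell
# Minimal sketch of what pam_shells.so checks.
shells_file=$(mktemp)
printf '%s\n' /bin/sh /bin/bash > "$shells_file"   # note: no /sbin/nologin

# shell_ok SHELL -> valid | invalid
shell_ok() {
  if grep -qx "$1" "$shells_file"; then echo valid; else echo invalid; fi
}

shell_ok /bin/bash        # → valid
shell_ok /sbin/nologin    # → invalid: exactly the ftp login failure mode
```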

I checked the PAM entry for proftpd. A PAM file lists the requirements for successful authentication to each service. It's very powerful and quite granular. The PAM file for proftpd contained:
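The screenshot of the file has not survived, but based on the description that follows, the stock /etc/pam.d/proftpd looks roughly like this (a reconstruction; exact module arguments may differ):

```
# /etc/pam.d/proftpd (reconstruction - exact contents may differ)
auth    required   pam_listfile.so item=user sense=deny file=/etc/ftpusers onerr=succeed
auth    required   pam_shells.so
auth    include    password-auth
account include    password-auth
session include    password-auth
```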

This was expected: proftpd authentication is by system password, except for users in the deny list, and requires a valid shell as defined by the pam_shells.so module, which checks /etc/shells. So, checking /etc/shells:
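The listing itself is also missing from the original post; a CentOS /etc/shells stripped of its nologin entries looks like this (illustrative):

```
/bin/sh
/bin/bash
/usr/bin/sh
/usr/bin/bash
```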

Huh? /sbin/nologin is absent! I checked the datestamps for /etc/shells and /sbin/nologin - the datestamp is Oct 31 - Just over a month ago????

This doesn't seem to add up. I interrogated the yum database to see if there had been any changes at all to proftpd or util-linux (which supplies both of the files). Yum didn't show any modification, even though the file dates don't match the repo.

Outside of an intrusion, what could modify this file? It took some digging to find a potential culprit: SCAP (Security Content Automation Protocol).

The installation of CentOS/RHEL provides the following option under the heading "Security Policy":


This is where cargo cult system administration rears its head. The usual maxim is don't install or configure something you don't understand. If you do need it, make sure you understand it.

This screen is a front-end for scripts to ensure compliance with various security policies. It's dynamic and is designed to react to security threats in real-time. It's a powerful tool. It's also horrendously complex and something of a black art. Acknowledging this, the open-scap foundation advise:

There is no need to be an expert in security to deploy a security policy. You don’t even need to learn the SCAP standard to write a security policy. Many security policies are available online, in a standardized form of SCAP checklists.
Very comforting. Don't look too closely, just check the box and forget we exist. Ignore the Law of Leaky Abstractions and allow the script to take care of your security policies.

Notice the above screen shows the default contains no rules. This is from the CentOS 7.2 installation. From the CentOS wiki on Security Policy:

Previously, with the 7.3.1611 ISOs, we knew that all 4 of the STIG installs produced an sshd_config file that would not allow SSHD to start. This was an upstream issue (Bug Report bz 1401069). This issue has been fixed with the 7.4.1708 ISOs and all installs produce working SSHD now.
Security profiles "Standard System Security Profile" and "C2S for CentOS Linux 7" can't be used in the CentOS 7.5.1804 installer. A bug causes the installer to require a separate partition for /dev/shm, which is not possible. RHBZ#1570956
So until CentOS 7.4, you couldn't use SCAP without having a broken system. That pretty much ensures it didn't get used.

For this server - being CentOS 7.5 - I'm pretty sure I chose the "Standard System Security Profile", thus becoming (in this instance) a cargo cult sysadmin. I selected an option I didn't fully understand the consequences of: it had a nice-sounding name, the description sounded kewl and it seemed this was the 'real' default. I remember looking up the definition and found it was a set of 14 very basic rules for sshd and firewalld security. What could go wrong?

What indeed.

It wasn't long before I found CVE-2018-1113. To quote the important part:

setup before version 2.11.4-1.fc28 in Fedora and Red Hat Enterprise Linux added /sbin/nologin and /usr/sbin/nologin to /etc/shells. This violates security assumptions made by pam_shells and some daemons which allow access based on a user's shell being listed in /etc/shells. Under some circumstances, users which had their shell changed to /sbin/nologin could still access the system.
The latest modification date on the CVE was Oct 31 - the same datestamp that I found on /etc/shells and /sbin/nologin. This looked curiously like a smoking gun.

At some point, this CVE translated into Red Hat security advisory RHSA-2018:3249.
Time to check the SCAP definitions:


This is curiously close to the date that FTP stopped working. It is also as far as I could get forensically. I'm not sure how the SCAP definitions updated; I assume that SCAP proactively fetches them and then applies them. This makes sense if SCAP is supposed to respond dynamically to security advisories in real time. That was the bit I overlooked in the description: if anything operates in real time, it follows that it must have its own independent update mechanism.

After submitting my report I was asked if we should abandon SCAP. Again, the cargo cult administration reaction would be to say "yes". However, after some careful thinking I responded with "no". Applying the five whys leads to the appropriate conclusion:

Problem: FTP not working.
1st Why: Users could not login.
2nd Why: PAM authentication failing
3rd Why: /sbin/nologin not listed in /etc/shells
4th Why: Security policy update removed /sbin/nologin from /etc/shells
5th Why: Use of /sbin/nologin is subject to security vulnerability

The use of /sbin/nologin was my choice to prevent shell access; however, my attempt to harden the system by denying ssh login for users was outside of the security policy. The problem was the use of /sbin/nologin in the first place (an old practice) rather than the preferred method of modifying the sshd config or placing a restriction within PAM.

The lesson (for me) is pretty sobering: if you intend to use best practice, use ONLY best practice, particularly with security policy. System modifications often contain leaky abstractions and are only tested against best and default practice. If you choose to step outside that box, make damn sure you know and understand the system and its consequences fully.

And don't check boxes you don't fully understand. Even boxes that promise to magically summon system security from the sky.

Thursday, 6 December 2018

On Cargo-Cult System Administration

During World War II, many Pacific Islands were used as fortified air bases by Japanese and Allied forces. The vast amount of equipment airdropped into these islands meant a drastic change in the lifestyle of the indigenous inhabitants.  There were manufactured goods, clothing, medicine, tinned food etc. Some of this was shared with the inhabitants, many of whom had never seen outsiders and for whom modern technology may as well have been magic - the purveyors of which seemed like gods.  

After WWII finished, the military abandoned the bases and stopped dropping cargo. On the island of Tanna in Vanuatu sprang the "John Frum" cult. In an attempt to get cargo to be dropped by parachute or land by plane, natives carved headphones from wood, made uniforms, performed parade drills, built towers which they manned, waved landing signals on runways. They imitated in every way possible what they had observed the military doing in an attempt to "summon" cargo from the sky.

The practice wasn't limited to Vanuatu, many other pacific islands developed "Cargo Cults". This may be surprising to us - even amusing - but fundamentally it stems from a disconnect between observed practice and an understanding of how systems (in this case logistical systems) work. The native observer has no idea that the actions of the soldiers they saw don't cause the cargo to appear, they merely facilitate it.

 Eric Lippert coined the phrase “cargo cult programming":
The cargo cultists had the unimportant surface elements right, but did not see enough of the whole picture to succeed. They understood the form but not the content. There are lots of cargo cult programmers - programmers who understand what the code does, but not how it does it. Therefore, they cannot make meaningful changes to the program. They tend to proceed by making random changes, testing, and changing again until they manage to come up with something that works.

The IT world is rife with Cargo Cult System Administrators. It usually manifests itself as an instinctive reaction to reboot a server as a first resort when anything goes wrong, without any effort to understand cause and effect. If that doesn't work, they start disabling firewalls or running system cleaners. If they actually manage to fix a problem, they invent a reason with no evidence ("It must have been a virus"). If they are reporting to someone non-technical (which is usually the case), this often gets accepted.

The big problem with "fixes" like this is they often cause collateral damage or degrade performance or system security.

How do these people keep jobs in IT? It's not surprising this level of ignorance exists amongst lay users, but it shouldn't exist in support staff. If those in charge of IT hiring are not themselves experts, though, they don't know any better either, and there seems to be this idea that a non-technical manager is perfectly capable of overseeing an IT department.

With increasing complexity of systems, it is all the more important to hire people that actually know what they are doing. At the very least, you need people that don't think it is somehow mystical and have a grounding in the basic technologies - that at least understand binary and hexadecimal number systems; have a grounding in basic electronics and circuit theory; can follow and develop algorithms and cut code in at least one programming language; can formulate troubleshooting steps from a block diagram of the system; that understand the principle of abstraction and working through layers of abstraction.

A simple yet effective method of Root Cause Analysis is known as 5Ys (Five Whys). It was developed by Sakichi Toyoda and adopted as best practice by the Toyota Motor Corporation. In its simplest form, it involves asking an initial 'why' question and getting a simple answer. You then ask 'why' of that answer, and continue asking 'why' of each successive answer until you have asked why five times, giving five increasingly low-level answers. When you can no longer ask 'why', you have your root cause. An example of 5Ys is:


The vehicle will not start. (the problem)
  1. Why? - The battery is dead. (First why)
  2. Why? - The alternator is not functioning. (Second why)
  3. Why? - The alternator belt has broken. (Third why)
  4. Why? - The alternator belt was well beyond its useful service life and not replaced. (Fourth why)
  5. Why? - The vehicle was not maintained according to the recommended service schedule. (Fifth why, a root cause)
An extension of this method is the Ishikawa diagram - also known as a fishbone diagram.

So, the next time you are tempted (or pressured) to reboot a server because it is not "working", perform 5Ys as a starting point. Find out what isn't working and ask why. Look at all the possible things that could stop it from working and test them one at a time. You may end up rebooting the server, but you will have a better understanding of how the system you are fixing works, and you may even put in preventative measures to stop the problem from recurring.

Sunday, 7 February 2016

LCA 2016 - Day 5


Last day of conference! This is generally considered to be the wind-down - where you acknowledge that your brain is probably too full to absorb anything really new. Think light dessert after a big meal.

Today's keynote was from Genevieve Bell. She started her talk by saying "I know I'm in Australia when I go to a conference that has a raffle."

Genevieve's talk was easily the most entertaining of the keynotes. She is an anthropologist who works for Intel. She was hired by Intel to help them understand two groups of people:

 - Women
 - ROW (Rest Of World - ie anything not American)

Describing herself as both an unreconstructed Marxist and a radical feminist, Genevieve discussed how we as an open-source community have a moral obligation to make a better world. There are a number of benefits to the open source paradigm, including facilitating innovation, sharing and re-use. The 'open' paradigm is increasingly extending to other areas such as open government, open culture, open health and open education.




5/1 - The eChronos Real-Time Operating System - Just what you want, when you want it by Stefan Götz


I've never worked with an RTOS before, so this workshop was a baptism of fire for me. I set up the emulator on my native OS rather than a VM, which leaves me wondering how much stuff I've broken that will need to be fixed. :-|

With not a lot of personal background I was still able to come out of the workshop with an appreciation of just how much is involved with an RTOS and the utility that eChronos offers. It also makes me appreciate just how much utility is sacrificed by including features into an OS that we could easily do without.

Since it was in the same room, I attended the Home Automation BoF (Birds of a Feather) session during lunch. Home automation is a nice idea and I have a level of admiration for those who pursue it; however, the cost and the level of work required just to have programmable central lighting control and a graphical display of your water usage are not worth it, IMHO.

5/2 - Free as in cheap gadgets: the ESP8266 by Angus Gratton


After lunch I went to two embedded Linux talks. The first one was on the ESP8266, which is essentially a super-cheap wifi module that can easily be connected to an Arduino or Raspberry Pi. So if you want your hardware project to talk wifi - this is the unit to get.

Angus covered its benefits and disadvantages and gave pointers for those wanting to work with this unit.

5/3 - Raspberry Pi Hacks by Ruth Suehle



Ruth is the co-author of the book of the same name as the talk. It re-whet my appetite to get one, particularly when she described some of the projects people had completed with the Pi.

Closing Session - Lightning Talks


The closing session includes the five-minute lightning talks. Although short, these punchy talks pack a lot in.
  • Steven Ellis - "A call to ARMS" was a pun plugging the ARM series of processors for NZOSS.
  • Geordie Millar - Explained his stackptr project: an open-source GPS map-sharing project.
  • Katie McLaughlin - Discussed the #hatrack project
  • Christopher Neugebauer - Plugged PyCon Australia in August 2016. There was also a plug for Kiwi PyCon.
  • Cherie Ellis - Plugged Govhack 2016
  • Bron Gondwana - Discussed using JMAP as a better way to do email.
  • Martin Krafft - Discussed the curse of NIH and emphasised "Do one thing. Do it well."
  • Keith Packard - Demonstrated his low-cost random number generator that hooks into /dev/random

The conference finished with an impromptu performance entitled "I lied about being a Linux type".

Overall, this was a great conference! I learnt a great deal from it and I look forward to the next one. I would recommend that anyone with an interest in Linux or open-source software attend.

Saturday, 6 February 2016

LCA 2016 - Day 4

Day 4 opened with a keynote by Jono Bacon, director of community at GitHub. Jono spoke of the evolution of the Open Source and Linux communities moving towards what he called "Community 3.0", where the expectations of open source infiltrate society at large and become part of its "common core". He stated that dignity is a fundamental human requirement and right, and that dignity is a product of several factors:

  • Dignity requires
  • Self-respect, which stems from a person's ability to
  • Contribute, which requires
  • Access

Jono described system 1 and system 2 thinking and outlined the SCARF model:

  • Status
  • Certainty
  • Autonomy
  • Relatedness
  • Fairness

The two golden rules are:

  1. Accomplish goals indirectly
  2. Influence behaviour with small actions

Community 3.0 = System 1&2 thinking + Behavioural patterns + Workflow + Experiences + Packaged guidance

I guess it goes without saying that I got a lot out of this keynote.

Day 4 also saw a marked improvement in the quality of the food offerings at morning tea. I think I ate 5 or 6 of these delectable goodies. I must learn to make them at home.




4/1 - Using Persistent Memory for Fun and Profit by Matthew Wilcox


The title of this talk sounded interesting, but I quickly worked out that there was very little I could gain from it. Persistent memory is memory that retains its state after powering off. Matthew works for Intel, and they just so happen to be about to release 3D XPoint DIMMs that do this - however, they will be expensive.

Applications must be written to take advantage of persistent memory - hence the need for Intel to encourage developers to do so.

I couldn't help the feeling of deja vu with this talk. Persistent memory used to be a common thing: the PDP-11 had it with core memory, and my MicroBee had it with CMOS memory. We have come full circle.

4/2 - Hardware and Software Architecture of The Machine by Keith Packard


Another vendor talk, this one from Hewlett-Packard. This talk focused on The Machine - which I had never heard of, but apparently a lot of the delegates had.

Much of this talk was dedicated to the challenge of dealing with 320TB of RAM shared amongst several processors. To handle this, a new paradigm was developed where memory is addressed in "books" instead of pages, stored in "shelves". Memory is made available by the "Librarian".

In order to support the architecture of The Machine, Linux needs to be modified to support:
  • Fabric-attached memory
  • File system abstractions
  • The Librarian file system

4/3 - Tutorial: Hunting Linux malware for fun and $flags by Marc-Etienne M.Léveillé


After lunch was a gruelling workshop where each participant was given a virtual machine infected with malware, with instructions to detect and defuse it and see how many 'flags' we could capture. Somehow we were meant to do this while listening to the talk.

These sorts of workshops are generally bad for my ego. I like to think I'm pretty good at this sort of stuff, but once you're shoved into a room full of people as good as or better than you, you start to feel like a clueless noob. I eventually captured five of the ten available flags, but the malware was still persistent on my machine and I had to resort to the cheat notes. This is where I found out that the email sending was made persistent through SSL injection.


I would have liked to have more time to study and understand the mechanisms. This was certainly a valuable tutorial with direct application to the real world.

4/4 - edlib - because one more editor is never enough by Neil Brown


While admitting that the last thing Linux needs is another editor, Neil explained his justification for doing so. He described the deficiencies of current editors from the Model-View-Controller perspective and detailed how his new editor aimed to overcome them. It was enough to make me wish it wasn't in alpha.

https://github.com/neilbrown/edlib


4/5 - Playing to lose: making sensible security decisions by assuming the worst by Tom Eastman


In a classic case of leave the best 'til last, Tom described how security is enhanced by assuming the worst. He started by describing the potential threats:
  • Script kiddies, all the time in the world, in it for the lulz
  • Organised criminals
  • Former employees (top threat)
  • Hacktivists
  • Nation-state actors

Tom then went on to explore each of the 'attack surfaces' of an online presence in detail:
  • Web server
  • App server
  • Database
  • Front-end interface
  • Infrastructure

I took several pages of notes from this excellent talk. His key recommendations are:
  • White-list input validation on all user-generated input
  • Escape all data appropriately for display
  • Mitigate cross-site scripting using a Content Security Policy. Key: ensure inline JavaScript is never executed.
  • Log and check CSP violation reports. 

Friday, 5 February 2016

LCA 2016 - Day 3

With the intensity of the Miniconfs over, the conference settled into the streams. This is where people chop and change to whatever talk appeals to them the most. In my case I concentrated on the security topics and hands-on workshops.

The day began with the second keynote speaker for the week (Catarina Mota) who spoke on the topic "Life is better with Open Source". Good talk, but not as good as yesterday's. Her main emphasis was on open-sourced hardware.

3/1- Using Linux features to make a hacker's life hard by Kayne Naughton

Kayne's talk emphasised the rise of Advanced Persistent Threats (APTs), which follow a distinct pattern of infiltration:
  1. Reconnaissance
  2. Weaponisation
  3. Delivery
  4. Exploitation
  5. Installation
  6. Command and Control
  7. Actions on Objectives

Successful APTs may continue for years if undetected. The six D's of mitigation are:
  1. Detect
  2. Deny
  3. Disrupt
  4. Degrade
  5. Deceive
  6. Destroy

Kayne discussed each of the steps in detail with examples.

3/2 - How To Write A Linux Security Module That Makes Sense For You by Casey Schaufler


The second security talk was highly specialised and targeted towards kernel module developers. Since I am unlikely to write a kernel module in the near future, this was more an information session for me. However I did learn the difference between major and minor security modules.

After lunch I dived into the first of two double-session workshops.

3/3 - Identity Management with FreeIPA by Fraser Tweedale


The first workshop was on FreeIPA. During the workshop we got to:
- Install a FreeIPA server and replica
- Enrol client machines in the domain
- Create and administer users
- Manage host-based access control (HBAC) policies
- Issue X.509 certificates for network services
- Configure a web server to use FreeIPA for user authentication and access control

It's definitely preferable to using Active Directory or OpenLDAP or (shudder) NIS.

During the workshop we used Vagrant with VirtualBox. I had never used Vagrant before and was very impressed. The workshop listed federation as one of the objectives, but we didn't have time to cover it.

I wouldn't class FreeIPA as 'true' identity management as it doesn't support connectors, data pumping or password sync - however it certainly does replication and federation, so that's a big plus.

3/4 - Packets don't lie: how you can use tcpdump/tshark (wireshark) to prove your point, by Sergey Guzenkov


The final workshop of the day was on wireshark. Now I've been using wireshark for years, so I was looking forward to something I had not seen before. I wasn't disappointed.

It was almost impossible to keep up with the lightning pace of this workshop. We quickly covered the basics of wireshark and tcpdump and launched straight into capturing SSL keys and decrypting SSL packets.

We also covered many of the little-used switches on both tshark and tcpdump and how they can be used to generate statistics for traffic reports, and we tried out the mergecap, capinfos and dumpcap tools.
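The SSL-decryption exercise follows a well-known pattern: have the TLS client log its session keys, then hand that key log to tshark. The interface and file names below are illustrative, and note that older tshark releases use an ssl.keylog_file preference instead of tls.keylog_file:

```shell
# 1. Ask the TLS client to log its session keys; curl, Firefox and
#    Chrome all honour this environment variable:
export SSLKEYLOGFILE="$HOME/tls-keys.log"

# 2. In another terminal, capture the encrypted traffic while the
#    client talks (Ctrl-C when done):
sudo tcpdump -i eth0 -w capture.pcap 'tcp port 443'

# 3. Replay the capture, pointing tshark at the key log - the HTTP
#    inside the TLS stream is now readable:
tshark -r capture.pcap -o tls.keylog_file:"$HOME/tls-keys.log" -Y http
```

The same key-log trick works interactively in Wireshark via Preferences → Protocols → TLS.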

LCA 2016 - Day 2

Day 2 of LCA2016 kicked off with the first of four keynotes for the conference delivered by George Fong, President of Internet Australia. He surprised everyone by actually giving a keynote rather than a shameless sponsor promotion. The keynote was entitled "The Cavalry's not coming... We are the Cavalry" which was subtitled "The challenges of the Changing Social Significance of the Nerd."

The main thrust of the keynote was the insatiable greed for control over technology exhibited by governments and legislators - especially when they have little to no understanding of the technologies they are trying to legislate. Particularly damaging is governments' desire to hamstring encryption technology and impose export controls on intangibles, and the effect this has on open source. George emphasised the need to communicate technical concepts to lay people in language they can understand.

Day 2 was also the second day of the miniconfs. For me, that meant the sysadmin miniconf. This one did not have the structure exhibited by the OpenCloud Symposium - each talk was an island of knowledge, some thick and some thin. The talks were also shorter - meaning more of them. The sysadmin miniconf has its own page, so you could ignore this blog entry completely and go there.

 

2/1 Is that a data-center in your pocket? by Steven Ellis

Subtitled "There will be dragons" rather than the predictable "...or are you pleased to see me?"
Steven provided a walk-through of how to create a portable, virtualised cloud infrastructure for demo, training and development purposes. This talk was heavy on detail and I found myself wanting to explore it further at a later date. He used a USB3-attached SSD connected to an ARM Pine64. The setup used nested virtualisation, thin LVM and Docker.

According to Steve, the "cloud" will very soon be mostly ARM64 - so it's time to prepare for that. He also demonstrated how UEFI can be used to secure boot virtual machines.


Martin highlighted the fact that in the transition from Unix to Linux we somehow forgot the habits born of Unix administration - in particular, we forgot about system automation, to wit:

  1. Monitoring
  2. Data collection
  3. Policy enforcement

Martin worked through the scripts available at https://github.com/madduck/retrans

2/3 A Gentle Introduction to Ceph by Tim Serong (Suse) 

I didn't get a lot out of this talk, other than becoming aware that Ceph is a distributed storage system popular in cloud deployments. His slides are here.

2/4 Keeping Pinterest Running by Joe Gordon

Joe talked about the challenges and differences in supporting a service as opposed to supporting a piece of software. His basic description is that it's like changing tyres whilst driving at 100MPH. The differences include:
  • stable branches
  • no drivers and configurations
  • no support matrix
  • dependency versions
  • dev support their own service
  • testing against prod traffic
One thing that really interested me is their use of a "Run Book" for the on-call support team. All recent changes are documented in the Run Book against anything it could potentially affect and who to contact about those changes. If on-call support has to respond to a problem, they consult the Run Book first.

In addition to a staging environment, they also have what they call the "canary" environment - akin to the canary in a coal mine metaphor. However, Joe said it was more akin to a rabbit in a sarin gas plant metaphor (insert chuckles).

Their dev->prod cycle looks like:
dev->staging->canary->prod

The staging system uses dark traffic, whereas the canary system operates on a minimal set of live traffic. If problems occur at any point, they roll back and conduct a blameless post-mortem. Joe emphasised that the blameless component was the most critical.

Before deployment, they conduct a pre-mortem covering:


- Dependencies
- Define an SLA
- Alerting
- Capacity Planning
- Testing
- On call rotation
- Decider to turn feature off if needed
- Incremental launch plan
- Rate limiting 

Tammy's central message was about developing self-healing systems through scripting and auto-remediation. For everything you think of that can go wrong, rather than just logging and crashing, run a script to fix the problem. Her team's motto is KTLO - Keep The Lights On.

She also emphasised the need for a "Captain's Log" - a log of every on-call alert - and for cross-team disaster recovery testing.

2/6 Network Performance Tuning by Jamie Bainbridge

This talk was more an in-depth tutorial on how to tune the network performance of your system and diagnose network-related problems. It was quite fast-paced; his slides are here.
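As a flavour of the kind of knob this sort of tuning involves, the kernel's socket-buffer limits can be inspected and raised via sysctl. The 16MB value below is illustrative, not a recommendation from the talk:

```shell
# Read the current maximum socket buffer sizes:
sysctl net.core.rmem_max net.core.wmem_max

# Raise them for high bandwidth-delay-product links (needs root);
# add the same settings to /etc/sysctl.conf to make them persistent:
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216

# Watch socket statistics while testing:
ss -s
```

As always with tuning, measure before and after - bigger buffers are not automatically better.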

2/7 'Can you hear me now?' Networking for containers by Jay Coles

I felt a little lost in this talk. This was part 3/3 in a series of talks by Jay on containerisation. As mentioned before, I have neglected containers - something I need to remediate as it seems everyone has embraced them. Much of the material for this talk is available here.

2/8 Pingbeat: y'know, for pings! by Joshua Rich

This was a great talk! Josh gave a quick overview of ICMP ping and then introduced Pingbeat, a small open-source program written in Go that can be used to record pings to hundreds or thousands of hosts on a network.
Pingbeat's power lies in its ability to write the ping responses to Elasticsearch, an open-source NoSQL-like data store with powerful built-in search and analytics. Combined with Kibana, a web-based front-end to Elasticsearch, you get an interactive interface to track, search and visualise your network health in near real-time.
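As a sketch of how it hangs together, a pingbeat configuration points the pinger at some targets and the output at Elasticsearch. The key names below are indicative only - check them against the pingbeat documentation for your version:

```yaml
# Sketch of a pingbeat.yml - field names may differ between releases.
input:
  period: 5            # seconds between ping rounds
  privileged: false    # use unprivileged UDP ping so root isn't needed
  targets:
    office:
      - 192.168.1.1
      - printer.example.test

output:
  elasticsearch:
    hosts: ["localhost:9200"]   # where Kibana will read the results from
```

With that in place, a Kibana dashboard over the pingbeat indices gives you the near-real-time network-health view Josh demonstrated.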
Being the ninth talk of the day, I kinda snoozed in this one. I didn't find it particularly useful or interesting hearing about the challenges of supporting an IT system used by research scientists.

Grafana is an open-source web charts dashboard. It can be configured to use a variety of backend data stores.

Andrew gave a live install, config and run demonstration of Grafana, starting from a fresh Ubuntu 14 VM with Docker (again!), where he installed and set up Graphite using Carbon to log both host CPU resources and MQTT feeds, and created a custom dashboard to suit.

This was another talk on what it's like to support a continuously available service. Things I gained from it included:


- Fixed time iterations
- Scope the known work for the next 2-3 weeks
- Leave sufficient slack for urgent work
- Be realistic

Interruptions:
- Assign team members to dev teams
- Have a rotating “ops goalkeeper” with a day pager, who is free from other work
- Have developers on pager as well. This helps in closing the feedback loop so that they are aware of issues in production

2/12 From Commit to Cloud by Daniel Hall

This talk was focused on leveraging the benefits of microinstances when managing cloud based services and infrastructure. Deployments should be:


- Fast (10 minutes)
- Small (ideally a single commit, aware of whole change)
- Easy (as little human involvement as possible, minimise context switching, simple to understand)

This leaves less to break, makes rollbacks easier, and allows the dev team to focus on just one thing at a time rather than a multitude of tracked changes. The basic idea is that deployments should be frequent and nobody should be afraid to deploy.

In the setup Daniel works with, they have:

- 30 separate microservices
- 88 docker machines across 15 workers
- 7 deployments to prod each working day!
- Only 4 rollbacks in 1.5 years

Their deployment steps are:
  1. write some code
  2. push to the git repository; build the app
  3. automated tests run
  4. app is packaged
  5. deployed to staging
  6. test in staging
  7. approve for prod (single click)
  8. deploy to production

2/13 LNAV

The final talk was a ten-minute ad hoc one discussing lnav as a replacement for tail -f /var/log/syslog when looking at the system log. I am fully converted to this tool and will be using it everywhere from now on. It is statically linked, so you can simply copy it from one system to another as a standalone binary.