Understanding the performance overhead of encryption and its upside

Every modern application has a requirement for encrypting certain amounts of data. The traditional approach has been to rely on some sort of transparent encryption (using something like encryption-at-rest capabilities in the storage layer, or column/field-level encryption in database systems). While this clearly minimizes the requirement for encryption within the application, it doesn’t secure the data from attacks like SQL injection, from someone simply dumping data because their account had excessive privileges, or from exposure of backups.

In comes the requirement of doing encryption at the application level, with, of course, the expected complexity of getting the implementation right at the code level (choosing the right ciphers and encryption keys, securing the objects), and of securing and maintaining the actual encryption keys, which more often than not end up in version control, or in some kind of object store subject to the usual issues.

Traditionally, there have been different approaches to securing encryption keys. An HSM could be used, with a considerable performance penalty. An external key management system could be used, like Amazon’s KMS, Azure Key Vault, or Google Cloud KMS, where the third party holds your encryption key. Finally, HashiCorp’s Vault offers its own Transit backend, which (depending on policy) can offload encryption/decryption/HMAC workflows, as well as signing and verification, abstracting away the complexity of maintaining encryption keys and allowing users and organizations to retain control over them inside Vault’s cryptographic barrier. As a brief summary of Vault’s capabilities when it comes to encryption as a service, we can refer to Vault’s documentation:

The primary use case for transit is to encrypt data from applications while still storing that encrypted data in some primary data store. This relieves the burden of proper encryption/decryption from application developers and pushes the burden onto the operators of Vault. Operators of Vault generally include the security team at an organization, which means they can ensure that data is encrypted/decrypted properly. Additionally, since encrypt/decrypt operations must enter the audit log, any decryption event is recorded.

The objective of this article, however, is not to explain the capabilities of Vault, but rather to analyze the overhead of using Vault to encrypt a single field in an application.

For that purpose, a MySQL data source is used, based on the traditional world sample database (about 4,000 rows). The test launches a single thread per entry; each thread encrypts one field of its entry and then persists the entry to Amazon S3.

This test was run on an m4.large instance in Amazon EC2, running a Vault server with a Consul backend, a MySQL server, and the script taking the metrics, all on the same system. It is inspired by a traditional big data use case, where information needs to be persisted to some sort of object store for later processing. The commented Python function executed in each thread is as follows:

import base64
import json
import time

def makeJson(vault, s3, s3_bucket, ID, Name, CountryCode, District, Population):
        # Take the starting time for the whole function
        tot_time = time.time()
        # Take the starting time for encryption
        enc_time = time.time()
        # Base64 a single Value
        Nameb64 = base64.b64encode(Name.encode('utf-8'))
        # Encrypt it through Vault
        NameEnc = vault.write('transit/encrypt/world-transit', plaintext=bytes(Nameb64), context=base64.b64encode('world-transit'))
        # Calculate how long it took to encrypt
        eetime = time.time() - enc_time
        # Create the object to persist and convert it to JSON
        Cityobj = { "ID": ID, "Name": NameEnc['data']['ciphertext'], "CountryCode": CountryCode, "District": District, "Population": Population }
        City = json.dumps(Cityobj)
        filename = "%s.json" % ID
        # Take the starting time for persisting it into S3, for comparison
        store_time = time.time()
        # Persist the object
        s3.put_object(Body=City, Bucket=s3_bucket, Key=filename)
        # Calculate how long it took to store it
        sstime = time.time() - store_time
        # Calculate how long it took to run the whole function
        tttime = time.time() - tot_time
        print("%i,%s,%s,%s\n" % (int(ID), str(sstime), str(eetime), str(tttime)))

This renders a single line of a CSV per thread, which was later aggregated in order to analyze the data. The average, minimum and median values are as follows: Avg/Min/Med
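The aggregation step can be sketched in a few lines; this is a minimal illustration (the sample rows below are made up), assuming the column order from the print statement above: ID, S3 time, encryption time, total time.

```python
import csv
import io
import statistics

def summarize(csv_text):
    """Aggregate per-thread CSV lines of the form ID,s3_time,enc_time,total_time."""
    enc_times = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip the blank lines the threads print
        _id, _s3_time, enc_time, _total_time = row
        enc_times.append(float(enc_time))
    return {
        "avg": statistics.mean(enc_times),
        "min": min(enc_times),
        "med": statistics.median(enc_times),
    }

sample = "1,0.10,0.02,0.12\n2,0.11,0.04,0.15\n3,0.09,0.03,0.12\n"
print(summarize(sample))
```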

As for concurrency, this runs 4,000 threads that are instantiated in a for loop. According to this limited dataset (about 4,000 entries), we’re looking at a 5% ~ 10% overhead in execution time. Data Points
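The driver loop itself isn’t shown above; a minimal sketch of launching one thread per row looks like this, with `worker` standing in for makeJson (the Vault and S3 clients would already be bound to it):

```python
import threading

def run_per_row(rows, worker):
    # One thread per database row, instantiated in a plain for loop.
    threads = []
    for row in rows:
        t = threading.Thread(target=worker, args=row)
        t.start()
        threads.append(t)
    # Wait for every thread to finish before aggregating results.
    for t in threads:
        t.join()

# Stub worker for illustration: it just records the row's Name field.
results = []
run_per_row([(1, "Kabul"), (2, "London")], lambda _id, name: results.append(name))
```

Launching 4,000 OS threads at once is fine for a test of this size, but a thread pool would be the more conventional choice at larger scale.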

It’s worth noting that during the tests Vault barely broke a sweat: top reported it using 15% CPU (against the 140% Python was using). This was purposely done at a limited scale, as it wasn’t our intention to test how far Vault could go; Vault Enterprise supports performance replication, which would allow it to scale to fit the needs of even the most demanding applications. As for the development effort, the only complexity added is the couple of statements needed to encrypt/decrypt the data, as the Python example shows. It’s also worth noting that Vault supports convergent encryption, which makes identical plaintext values return the same ciphertext, in case someone needs to match encrypted values in WHERE clauses.
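To see why deterministic ciphertexts enable WHERE-clause lookups, here is a toy simulation; it uses a keyed HMAC as a stand-in for a convergent ciphertext and is emphatically not Vault’s actual construction:

```python
import hashlib
import hmac

KEY = b"demo-key"  # stand-in for key material that would live inside Vault

def fake_convergent_encrypt(plaintext):
    # Deterministic: identical plaintexts always yield identical tokens,
    # which is the property convergent encryption provides.
    return hmac.new(KEY, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()

# The "Name" column as it would sit, encrypted, in the database.
stored = {fake_convergent_encrypt(name): city_id
          for city_id, name in [(1, "Kabul"), (2, "London")]}

# A query like WHERE Name = 'London' becomes a lookup on the token.
assert stored[fake_convergent_encrypt("London")] == 2
```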

Using this pattern, along with the well-known secret management features in Vault, helps mitigate the top 10 database attacks as documented by BCS, The Chartered Institute for IT:

  1. Excessive privileges: Using Transit in combination with Vault’s database secret backend, an organization can ensure that each user or application gets the right level of access to the data, which in turn can be encrypted, requiring a further level of privilege to decode it.

  2. Privilege abuse: With Transit, the data obtained even with the right privileges is encrypted, and potentially requires the right “context” to decrypt it, even if the user has access to the ciphertext.

  3. Unauthorized privilege elevation: Much like the cases above, Vault can determine the right level of access a user gets to a database, effectively terminating the “gold credentials” pattern and shielding the underlying encrypted data from operator access.

  4. Platform vulnerabilities: Even if the platform is vulnerable, the data would be secure.

  5. SQL injection: As data is not transparently encrypted, a vulnerable application would mostly dump obfuscated data, which can be rewrapped with an updated encryption key upon detection of a vulnerability.

  6. Weak audit: Vault audits encryption and decryption operations, effectively creating an audit trail that allows pinpointing exactly who has access to the data.

  7. Denial of service: Through Sentinel, our policy as code engine, Vault can evaluate traffic patterns through rules and deny access accordingly.

  8. Database protocol vulnerabilities: As before, even if the data is dumped, it wouldn’t be transparently decrypted.

  9. Weak authentication: Vault’s database secret backend generates short-lived credentials that can be revoked centrally and have the right level of complexity.

  10. Exposure of backup data: Backups would be automatically encrypted, just like the underlying data.


About the author: Nicolas Corrarello is the Regional Director for Solutions Engineering @ HashiCorp based out of London.

Reading Vault Secrets in your Jenkins pipeline

The Jenkins credential store in most enterprises is becoming a potential attack vector. It’s generally filled with long-lived credentials, sometimes even to production systems.

In comes HashiCorp’s Vault, a secret management solution that enables the secure storage of secrets and dynamic generation of credentials for your jobs. Looks like a great match, right? Look at the demo; it certainly looks promising (especially with Jenkins’ beautiful new Blue Ocean UI):

Jenkins Demo

Interested? Let’s dive into it:

What is Hashicorp Vault?

Quite simply, it is a tool for managing secrets. What’s really innovative about Vault is that it has methods for establishing both user and machine identity (through auth backends), so secrets can be consumed programmatically. Identity is ultimately established by a (short-lived) token. It also generates dynamic secrets on a number of backends, such as Cassandra, MySQL, PostgreSQL, SQL Server, MongoDB, and more. I’m not going to cover how to configure Vault here; feel free to head over to the Vault documentation. For all intents and purposes of this document, you can download Vault and run it in “dev mode” for a quick test:

$ vault server -dev
WARNING: Dev mode is enabled!

In this mode, Vault is completely in-memory and unsealed.
Vault is configured to only have a single unseal key. The root
token has already been authenticated with the CLI, so you can
immediately begin using the Vault CLI.

The only step you need to take is to set the following
environment variable since Vault will be talking without TLS:

    export VAULT_ADDR='http://127.0.0.1:8200'

The unseal key and root token are reproduced below in case you
want to seal/unseal the Vault or play with authentication.

Unseal Key: 2252546b1a8551e8411502501719c4b3
Root Token: 79bd8011-af5a-f147-557e-c58be4fedf6c

==> Vault server configuration:

         Log Level: info
           Backend: inmem
        Listener 1: tcp (addr: "127.0.0.1:8200", tls: "disabled")

Please don’t run Vault in dev mode in production; feel free to reach out through the usual means if you need help.

AppRole

AppRole is a secure introduction method to establish machine identity. In AppRole, in order for an application to get a token, it needs to log in using a Role ID (which is static, and associated with a policy) and a Secret ID (which is dynamic, one-time use, and can only be requested by a previously authenticated user/system). In this case, we have two options:

  • Store the Role IDs in Jenkins
  • Store the Role ID in the Jenkinsfile of each project

So let’s generate a policy for this role:

$ echo 'path "secret/hello" {
  capabilities = ["read", "list"]
}' | vault policy-write java-example -
Policy 'java-example' written.

In this case, tokens assigned to the java-example policy would have permission to read a secret on the secret/hello path.

Now we have to create a role that will generate tokens associated with that policy, and read it back along with its Role ID (this assumes the AppRole backend has already been enabled, e.g. with vault auth-enable approle):

$ vault write auth/approle/role/java-example \
> secret_id_ttl=60m \
> token_ttl=60m \
> token_max_ttl=120m \
> policies="java-example"
Success! Data written to: auth/approle/role/java-example
$ vault read auth/approle/role/java-example
Key                 Value
---                 -----
bind_secret_id      true
bound_cidr_list
period              0
policies            [default java-example]
secret_id_num_uses  0
secret_id_ttl       3600
token_max_ttl       7200
token_num_uses      0
token_ttl           3600
$ vault read auth/approle/role/java-example/role-id
Key     Value
---     -----
role_id 67bbcf2a-f7fb-3b41-f57e-88a34d9253e7

Note that in this case, the tokens generated through this role have a time-to-live of 60 minutes. That means that after an hour, the token expires and can’t be used anymore. If your Jenkins jobs are shorter, you can lower that time-to-live to increase security.

Let’s write a secret that our application will consume:

$ vault write secret/hello value="You've successfully retrieved a secret from HashiCorp Vault"
Success! Data written to: secret/hello

Now Jenkins will need permissions to retrieve Secret IDs for our newly created role. Jenkins shouldn’t be able to access the secret itself, list other Secret IDs, or even the Role ID.

$ echo 'path "auth/approle/role/java-example/secret-id" {
  capabilities = ["read","create","update"]
}' | vault policy-write jenkins -

And generate a token for Jenkins to log into Vault. This token should have a relatively long TTL, and will have to be rotated:

$ vault token-create -policy=jenkins
Key             Value
---             -----
token           de1fdee1-72c7-fdd0-aa48-a198eafeca10
token_accessor  8ccfb4bb-6d0a-d132-0f1d-5542139ec81c
token_duration  768h0m0s
token_renewable true
token_policies  [default jenkins]

In this way we’re minimizing attack vectors:

  • Jenkins only knows its Vault token (and potentially the Role ID), but doesn’t know the Secret ID, which is generated at pipeline runtime and is for one-time use only.

  • The Role ID can be stored in the Jenkinsfile. Without a token and a Secret ID, it has no use.

  • The Secret ID is dynamic and one-time use only, and only lives for the short period of time in which it is requested and a login is carried out to obtain a token for the role.

  • The role token is short-lived, and it will be useless once the pipeline finishes. It can even be revoked once you’re finished with your pipeline.

Jenkins pipeline and configuration

A full example of the project is available here. The Jenkinsfile we will be using is this one:

pipeline {
  agent any
  stages { 
    stage('Cleanup') {
      steps {
        withMaven(maven: 'maven-3.2.5') {
          sh 'mvn clean'
        }
        
      }
    }
    stage('Test') {
      steps {
        withMaven(maven: 'maven-3.2.5') {
          sh 'mvn test'
        }
        
      }
    }
    stage('Compile') {
      steps {
        withMaven(maven: 'maven-3.2.5') {
          sh 'mvn compile'
        }
        
      }
    }
    stage('Package') {
      steps {
        withMaven(maven: 'maven-3.2.5') {
          sh 'mvn package'
        }
        
      }
    }
    stage('Notify') {
      steps {
        echo 'Build Successful!'
      }
    }
    stage('Integration Tests') {
      steps {
      sh 'curl -o vault.zip https://releases.hashicorp.com/vault/0.7.0/vault_0.7.0_linux_arm.zip ; yes | unzip vault.zip'
        withCredentials([string(credentialsId: 'role', variable: 'ROLE_ID'),string(credentialsId: 'VAULTTOKEN', variable: 'VAULT_TOKEN')]) {
        sh '''
          set +x
          export VAULT_ADDR=https://$(hostname):8200
          export VAULT_SKIP_VERIFY=true
          export SECRET_ID=$(./vault write -field=secret_id -f auth/approle/role/java-example/secret-id)
          export VAULT_TOKEN=$(./vault write -field=token auth/approle/login role_id=${ROLE_ID} secret_id=${SECRET_ID})
          java -jar target/java-client-example-1.0-SNAPSHOT-jar-with-dependencies.jar 
        '''
        }
      }
    }
  }
  environment {
    mvnHome = 'maven-3.2.5'
  }
}

Lots of stuff here, including certain Maven tasks, but we will be focusing on the Integration Tests stage (the last one). What we’re doing here is:

  • Downloading the Vault binary (in my case the ARM Linux one; I’ll brag about that in a different blog post :D)

  • Reading credentials from the Jenkins credential store. In this case I’m storing both the ROLE_ID and the VAULT_TOKEN I’ve generated for Jenkins. As mentioned before, if you want to split them for more security, you can just keep the ROLE_ID as a variable in the Jenkinsfile.

  • Doing a set +x to disable verbosity in the shell in order not to leak credentials, although even with -x, in this case I’d only be exposing the Secret ID (which is useless after I’ve already used it) and the VAULT_TOKEN that I’m going to use to consume secrets, which is short-lived and can be revoked at the end of this run just by adding a vault token-revoke command.

  • Retrieving a SECRET_ID using Jenkins’ administrative token (which I generated manually beforehand; that’s the only one that is relatively long-lived, but it can only generate Secret IDs).

  • Doing an AppRole login with the ROLE_ID and the SECRET_ID, and storing the resulting short-lived token.

  • My Java Process is reading VAULT_TOKEN and VAULT_ADDR to contact Vault and retrieve the secret we stored.

The VAULT_TOKEN (and optionally the ROLE_ID) are stored in the credential store in Jenkins:

Jenkins Cred Store

Who has time for testing

And now for a short update, here’s my PuppetConf presentation of this year.

PuppetConf 2016: Puppet on Windows

Don't DROWN on OpenSSL

Yet another OpenSSL vulnerability on the loose. While the vendors release fixes, Puppet to the rescue! Let’s start by disabling those weak ciphers.

For the Apache HTTP server, you can use the puppetlabs-apache module to disable weak ciphers:

class { 'apache::mod::ssl':
  ssl_cipher           => 'EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384
                           EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256
                           EECDH+aRSA EECDH EDH+aRSA !aNULL !eNULL !LOW !3DES !MD5
                           !EXP !PSK !SRP !DSS !EXPORT',
  ssl_protocol         => [ 'all', '-SSLv2', '-SSLv3' ], #Default value for the module
  ssl_honorcipherorder => 'On', #Default value for the module
}

More information on https://forge.puppetlabs.com/puppetlabs/apache/readme#class-apachemodssl

Using Postfix? No problem, the camptocamp-postfix module can help you there:

postfix::config {
    'smtpd_tls_security_level':            value => 'secure';
    'smtpd_tls_mandatory_protocols':       value => '!SSLv2, !SSLv3';
    'smtpd_tls_mandatory_exclude_ciphers': value => 'aNULL, MD5'
}

More information on https://forge.puppetlabs.com/camptocamp/postfix

How about IIS 7? We can use the puppetlabs-registry module to disable the weak protocols:

class profile::baseline {
  registry_value { 'HKLM\System\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 2.0\Server\Enabled':
    ensure => present,
    type   => dword,
    data   => 0x00000000,
  }

  registry_value { 'HKLM\System\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\SSL 3.0\Server\Enabled':
    ensure => present,
    type   => dword,
    data   => 0x00000000,
  }
}

More information on https://forge.puppetlabs.com/puppetlabs/registry

Extending your Puppet Language Dictionary for Application Orchestration

Time to extend your Puppet Language skills! The new features for application orchestration introduce a number of language constructs you’ll want to know about if you’re planning to describe your applications and supporting infrastructure in the Puppet language.

The Environment Service Resource Type

If you haven’t worked with Puppet types and providers before, I’d strongly suggest you read the documentation on custom types and providers (available here https://docs.puppetlabs.com/guides/custom_types.html and here https://docs.puppetlabs.com/guides/provider_development.html, respectively).

It is worth noting that, unlike your traditional Puppet types and providers, an environment service resource type is far simpler to develop, as you’ll see in the first example below.

The environment service resource is the key to defining the relationship between different software layers. It effectively extends a traditional Puppet type (like file, service, or package) to describe application services (an SQL database, web server, API layer or a microservice, for example). Like traditional Puppet types, the environment service resource can (optionally) have a provider, which in this case implements a number of service-related methods: for instance, how to check whether an environment service produced by an application component is actually ready for work before continuing the deployment.

The environment service type looks pretty much like any other Puppet type, but it has the property of being a capability. A very simple example could be an SQL resource, as described below:

    Puppet::Type.newtype :sql, :is_capability => true do
      newparam :name, :namevar => true
      newparam :dbname
      newparam :dbhost
      newparam :dbpass
      newparam :dbuser
    end

What I’m doing here is telling the Puppet resource abstraction layer (or rather the service abstraction layer) that I have modeled an SQL database that can be produced by a server (managed with Puppet), for use by one or more servers that need an SQL database. You can think of it as a contract between the servers on your network: What information do they need to exchange in order to interoperate?

The Application Layers

The components in your application should be iterable; that is to say, you should be able to deploy multiple instances of your layers (multiple web servers, for example). What better way to do that than the way you’ve been doing it historically, with defined types?

Here’s where you can leverage your existing modules, profiles and roles. Remember, Puppet is about code reusability; you don’t need to reinvent the wheel every time you do something.

Just to show you a quick example, here’s a defined type for my database layer:

    define apporch_blog::db (
      $dbuser,
      $dbpass,
      $host = $::fqdn,
      ){
       $override_options = {
         'mysqld' => {
           'bind-address' => '0.0.0.0',
         }
       }
        firewall { '101 Allow Connections to Database':
          dport    => 3306,
          proto    => 'tcp',
          action  => 'accept',
        }
        # For simplicity, I’m declaring the mysql::server class here but you’d normally do this outside of the define (Preferably in Hiera).

        class {'::mysql::server':
          root_password    => 'whatever',
          override_options => $override_options,
        }
        mysql::db { $name:
          user     => $dbuser,
          password => $dbpass,
          host     => '%',
          grant    => ['ALL PRIVILEGES'],
        }
    }

As you can see, this is pretty much standard; I’m creating a new defined type that sets up a database. Now, going back to the environment service resource I described before, I could produce an SQL resource out of this database, so let’s modify the code to do that:

    define apporch_blog::db (
      $dbuser,
      $dbpass,
      $host = $::fqdn,
      ){
       $override_options = {
         'mysqld' => {
           'bind-address' => '0.0.0.0',
         }
       }
       firewall { '101 Allow Connections to Database':
         dport    => 3306,
         proto    => 'tcp',
         action  => 'accept',
       }
       # It is a bad idea to instantiate a class in this way, since if I try to apply the db class to multiple nodes the catalog compilation may fail, but for simplicity:
       class {'::mysql::server':
         root_password    => 'whatever',
         override_options => $override_options,
       }
       mysql::db { $name:
         user     => $dbuser,
         password => $dbpass,
         host     => '%',
         grant    => ['ALL PRIVILEGES'],
       }
     }
     Apporch_blog::Db produces Sql {
       dbuser => $dbuser,
       dbpass => $dbpass,
       dbhost => $host,
       dbname => $name,
     }

Now I’ve told the Puppet parser that this particular define type is producing a resource, and I’m mapping the parameters of this type to the parameters of this resource.

Just as this defined type produces the resource, I’ve created another defined type to consume it:

    define apporch_blog::web (
      $webpath,
      $vhost,
      $dbuser,
      $dbpass,
      $dbhost,
      $dbname,
      ) {
        package {['php','mysql','php-mysql','php-gd']:
          ensure => installed,
        }
        firewall { '100 Allow Connections to Web Server':
          dport    => 80,
          proto    => 'tcp',
          action  => 'accept',
        }

        include ::apache
        include ::apache::mod::php
        apache::vhost { $vhost:
          port    => '80',
          docroot => $webpath,
          require => [File[$webpath]],
        }

        file { $webpath:
          ensure => directory,
          owner => 'apache',
          group => 'apache',
          require => Package['httpd'],
        }
        class { '::wordpress':
          db_user        => $dbuser,
          db_password    => $dbpass,
          db_host        => $dbhost,
          db_name        => $dbname,
          create_db      => false,
          create_db_user => false,
          install_dir    => $webpath,
          wp_owner       => 'apache',
          wp_group       => 'apache',
        }
      }
    Apporch_blog::Web consumes Sql {

    }

Please note two things in this particular code extract:

  • I’m declaring variables in the defined type that are going to be filled with parameters from the environment service resource.
  • I’m also declaring that this defined type consumes the environment service resource, but I’m not mapping any variables. That’s because I’m using the same names for the variables.

The Application Construct

Now let’s glue everything together. The application construct allows us to model the application and declare the dependencies within the layer:

    application apporch_blog (
      $dbuser  = 'wordpress',
      $dbpass  = 'w0rdpr3ss',
      $webpath = '/var/www/wordpress',
      $vhost   = 'wordpress.puppetlabs.demo',
      ) {
        apporch_blog::db { $name:
          dbuser => $dbuser,
          dbpass => $dbpass,
          export => Sql[$name],
        }
        apporch_blog::web { $name:
          webpath => $webpath,
          consume => Sql[$name],
          vhost   => $vhost,
        }
    }

As you can see, it’s here where I’m actually defining the variables to configure the services and create the environment service resource.

The Site Construct

As with your layers, you should be able to deploy your application in multiple instances. That’s where the site construct comes into play. In my particular example, I’m defining it in the site.pp file for the particular environment. But here’s where you actually instantiate your application and glue it to the specific nodes (using the hostnames for the specific nodes):

    site {
      apporch_blog { 'example':
          nodes           => {
            Node['db.demo'] => [ Apporch_blog::Db[ 'example' ]],
            Node['www.demo'] => [ Apporch_blog::Web[ 'example' ]],
        }
      }
    }

In all fairness, I’ve had the unfair advantage of using the Puppet language and modules for quite a while now, but as you can see, once again, the code is human-readable and draws from the existing modules on the Puppet Forge. As you can imagine, while this is an extremely simple example, the possibilities are endless:

  • Think of your microservices running in Docker containers — this is a great way of actually tying the whole application together. See an example here.
  • Load balancing? Sure, your web servers can produce resources that are consumed by the load balancer.
  • You can easily replace the nodes in your application, assuming you have current backups. You might be able to re-deploy your database and have Puppet automatically update your web servers.

Now that you have an idea of what the language looks like, you should read the full documentation. It’s worth noting that the Puppet Orchestrator, the set of tools that will actually make the orderly deployment possible, is not covered here. But you can learn about it from the documentation linked above and below.

[*] This article was originally authored for the Puppet Labs Blog (http://blog.puppetlabs.com).

Who has time for testing

Yeah, I know I’ve neglected my blog, but hey, I’ve been busy!

Anyway, here’s the video of my PuppetConf presentation on testing and CI. Enjoy!

PuppetConf 2015: Who has time for testing

Automating Networks

I did a short presentation on how to automate the configuration of Arista devices with Puppet at the Arista meetup. Luckily I recorded it, for those of you who weren’t lucky enough to see it live. The video of the screencast is below.

Automating Networks - Puppet with Arista Switches

Creating an API with Sinatra and wrapping Puppet Code around it

Another week, another commute to a different country. You do have to love Europe! Anyway, with an hour on my hands, I started thinking back to, well, probably the question I get asked the most: “What can you do with Puppet?”. There is probably no short or straight answer to it, so I tend to start enumerating anything that might be relevant, and lately I’ve been finishing up with: “and also, if you have anything that exposes resources through an API, you can wrap Puppet code around it!”. An excellent example of this is Gareth’s AWS module. It is, however, too complicated to illustrate the principle, since it’s a fairly complex module with quite a bit of Ruby code. For those who don’t know me, I don’t do Ruby… well… kind of.

Ruby does have a number of brilliant gems (see what I did there?), and ever since I started hearing about Ruby, the one that always caught my eye was Sinatra.

Not this Sinatra. Thanks wikimedia commons for the image!

It is one of those things that look amazingly simple but are really powerful, and that, of course, I always wanted to play around with. With all this in mind, I decided to write an extremely simple REST API, and a Puppet module around it with very simple curl calls, to demo how this would look. And of course, I did it my way.

The absolutely basic Sinatra syntax is:

httpverb '/httppath' do
  rubycode
end

So let’s look at a commented example:

# HTTP GET verb to retrieve a specific item. Returns 404 if item is not present.
# HTTP Verb and Path definition
get '/items/:key' do
  # Check if the string is actually present in the file and give a quick message in case this is being opened with a browser. Sinatra returns 200 OK as default.
  unless File.readlines("file.out").grep(/#{params['key']}/).size == 0 then
    "#{params['key']} exists in file!"
  else
  # or return the appropriate http code.
    status 404
  end
end

So with a few bits and bolts of Ruby code here and there, here’s my full API example, with three basic verbs.

#Extremely Simple API with three basic verbs.
require 'sinatra'
require 'fileutils'
require 'tempfile'

# HTTP PUT verb creates an item on the file. Returns HTTP Code 422 if it already exists.
put '/items/:key' do
  if File.readlines("file.out").grep(/#{params['key']}/).size == 0 then
    open('file.out', 'a') { |f|
      f.puts "#{params['key']}\n"
    }
    # The response body must be the last expression evaluated.
    "Created #{params['key']}"
  else
    status 422
  end
end

# HTTP GET verb to retrieve a specific item. Returns 404 if item is not present.
get '/items/:key' do
  unless File.readlines("file.out").grep(/#{params['key']}/).size == 0 then
    "#{params['key']} exists in file!"
  else
    status 404
  end
end

# HTTP GET verb to retrieve all the items.
get '/items' do
  file = File.open("file.out")
  contents = ""
  file.each {|line|
            contents << line
  }
  "#{contents}"
end

# HTTP DELETE verb to remove a specific item. With a horrendous hack to recreate the file without the specific key. Returns 404 if the item is not present.
delete '/items/:key' do
  unless File.readlines("file.out").grep(/#{params['key']}/).size == 0 then
    tmp = Tempfile.new("extract")
    open('file.out', 'r').each { |l| tmp << l unless l.chomp ==  params['key'] }
    tmp.close
    FileUtils.mv(tmp.path, 'file.out')
    "#{params['key']} deleted!"
  else
    status 404
  end
end

So I’m sure there will be a lot of Ruby experts criticizing how that code is written; to be honest, taking into account that this is the first piece of Ruby code I’ve written, and that I lifted some examples from everywhere, I’m quite happy with my Frankenstein API.

So let’s curl around to see how my API works.

If I PUT an item there, it saves it to the file, but of course, if the item already exists, it won’t allow me to do it again:

[ncorrare@risa ~]# curl -vX PUT http://localhost:4567/items/item2
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> PUT /items/item2 HTTP/1.1
[...]
>
< HTTP/1.1 200 OK
< Content-Type: text/html;charset=utf-8
[...]
* Connection #0 to host localhost left intact

[ncorrare@risa ~]# curl -vX PUT http://localhost:4567/items/item2
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> PUT /items/item2 HTTP/1.1
[...]
>
< HTTP/1.1 422 Unprocessable Entity
[...]
<
* Connection #0 to host localhost left intact
[ncorrare@risa ~]#

Note that when I try to PUT the item a second time, it returns an HTTP code of 422. By the way, I spent a bit of time researching which HTTP code is the right one to return in this case. I think I got it right, but if anyone has a table documenting how to map these, please do send it over.
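For what it’s worth, the mapping this API settles on can be written down as a tiny helper. This is just my reading of the status codes, not an official table, and `status_for` is a hypothetical name:

```ruby
# Status codes the API returns, keyed by verb and by whether the item
# already exists: PUT an existing item -> 422, GET/DELETE a missing one -> 404.
def status_for(verb, exists)
  case verb
  when :put          then exists ? 422 : 200
  when :get, :delete then exists ? 200 : 404
  end
end
```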

By the way, depending on the Sinatra / WEBrick version, you may need to specify a Content-Length header (even if it’s zero), or you might get an HTTP/1.1 411 Length Required response. It would look like this:

/usr/bin/curl -H 'Content-Length: 0' -fIX PUT http://localhost:4567/items/item1

Now, how about a couple of GETs?

[ncorrare@risa ~]# curl -vX GET http://localhost:4567/items
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> GET /items HTTP/1.1
[...]
>
< HTTP/1.1 200 OK
< Content-Type: text/html;charset=utf-8
[...]
<
item1
item2
* Connection #0 to host localhost left intact
[ncorrare@risa ~]# curl -vX GET http://localhost:4567/items/item2
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> GET /items/item2 HTTP/1.1
[...]
>
< HTTP/1.1 200 OK
< Content-Type: text/html;charset=utf-8
[...]
* Connection #0 to host localhost left intact
item2 exists in file!
[ncorrare@risa ~]# curl -vX GET http://localhost:4567/items/item3
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> GET /items/item3 HTTP/1.1
[...]
< HTTP/1.1 404 Not Found
< Content-Type: text/html;charset=utf-8
[...]
<
* Connection #0 to host localhost left intact

Please note that GET is, of course, the default method, but I’m specifying it just for documentation purposes. A GET to /items returns the whole contents of the file, while a GET to /items/specificitem checks whether the item (a string in this case) exists in the file. A GET to a non-existent item returns 404.

Finally, let’s delete something.

[ncorrare@risa ~/puppetlabs/demo/razor]# curl -vX DELETE http://localhost:4567/items/item2
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> DELETE /items/item2 HTTP/1.1
[...]
>
< HTTP/1.1 200 OK
< Content-Type: text/html;charset=utf-8
[...]
<
* Connection #0 to host localhost left intact
item2 deleted!
[ncorrare@risa ~/puppetlabs/demo/razor]# curl -vX DELETE http://localhost:4567/items/item2
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> DELETE /items/item2 HTTP/1.1
[...]
>
< HTTP/1.1 404 Not Found
< Content-Type: text/html;charset=utf-8
[...]
<
* Connection #0 to host localhost left intact
[ncorrare@risa ~/puppetlabs/demo/razor]#

For all intents and purposes, that’s a great API; it even has error checking! But the best is yet to come: now it’s time to actually wrap a Puppet module around it. As I’ll be using exec, I have to be spot on about managing errors. Now here’s the kicker: in order for curl to return an HTTP error code >= 400 as an exit code > 0, you have to use the -f parameter.

So in this case, I basically created a module (available along with the full API code at https://github.com/ncorrare/ncorrare-itemsapi). The key here is the defined resource type:

define itemsapi (
  $ensure,
) {
  validate_re($ensure, ['present', 'absent'])
  if $ensure == 'present' {
    exec { "create item ${name}":
      command => "/usr/bin/curl -H 'Content-Length: 0' -fIX PUT http://localhost:4567/items/${name}",
      unless  => "/usr/bin/curl -fIX GET http://localhost:4567/items/${name}",
    }
  } elsif $ensure == 'absent' {
    exec { "delete item ${name}":
      command => "/usr/bin/curl -fIX DELETE http://localhost:4567/items/${name}",
      onlyif  => "/usr/bin/curl -fIX GET http://localhost:4567/items/${name}",
    }
  } else {
    fail('ensure parameter must be present or absent')
  }
}

That’s it! You now have a new type to manage an API, which is fully idempotent, so you can basically:

itemsapi { "whatever":
  ensure => "present",
}

or:

itemsapi { "whatever":
  ensure => "absent",
}

By the way, can you find the references to Sinatra (the actual one) in the blog post? There are two (I think!)

Razor by example

A long long time ago, in a country far away, I wrote the Network Install guide for Fedora describing how to set up a tftp/dhcp server to PXE Boot machines, with a nice menu and all that. At the time, all the cool kids were using Satellite / Spacewalk, or were looking up server MAC addresses in their CMDB / inventory. It was published in 2008, I’m not a cool kid anymore, and it looks really outdated (much like my profile in the Fedora Wiki, including a picture from 11 years and 15 kilos ago).

So it’s time to redeem myself in front of all those poor people who have been maintaining /var/lib/tftpboot directories and dhcpd.conf files. In comes Razor, with a different approach to things. Razor consists of both a microkernel and a server-side component. A bare metal server (or unprovisioned VM) boots this microkernel, collects all the inventory information, and sends it to the server through an API call.

In turn, the Razor server will evaluate that inventory information against a set of rules, and will provision the server with a base operating system (Linux, Windows, ESXi), and also configure a set of post-install tasks (a.k.a. Brokers) in order to hand off the node to a configuration management tool, such as Puppet, or another one related to some culinary aspect.

There is a lot of documentation available on how to set up a Razor server, and I’m not particularly a fan of reinventing the wheel. Of course you need a DHCP server (or, if you’re into that, you can burn a boot.iso image, though you’ll still need a DHCP server to provide basic network configuration in order for the target server to reach the Razor server).

As for which DHCP server to choose, I’m using the one bundled with my Mikrotik router, but in a real world example, you’d probably use one bundled with a proper firewall, or at least ISC DHCPD. Configuring it is far from rocket science:

  • If your network boot ROM supports iPXE, the filename is ‘bootstrap.ipxe’; there is also an ‘undionly.kpxe’ image for ROMs that don’t (DHCP option 67, filename in ISC DHCPd).
  • You should set up a TFTP server (xinetd, dnsmasq, others) and of course add the parameter to your DHCP server (DHCP option 66, next-server in ISC DHCPd).

Here are a couple of examples lifted from the Razor Server wiki:

# This works for ISC DHCP Server 4.1-ESV-R4 (ubuntu 12.04 LTS default)
subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.200 192.168.1.250;
  option subnet-mask 255.255.255.0;
  if exists user-class and option user-class = "iPXE" {
    filename "bootstrap.ipxe";
  } else {
    filename "undionly.kpxe";
  }
  next-server 192.168.1.100;
}
#This is an example for dnsmasq
dhcp-match=IPXEBOOT,175
dhcp-boot=net:IPXEBOOT,bootstrap.ipxe
dhcp-boot=undionly.kpxe
# TFTP setup
enable-tftp
tftp-root=/var/lib/tftpboot

The PXE image will in turn chainload either the microkernel, or the kernel/initrd of the appropriate operating system (or the Windows Preinstallation Environment).

In order to set up your Razor server, I’d advise you to use the packages from the PuppetLabs repository, available at yum.puppetlabs.com (the package is called razor-server). Or better yet, just use Puppet to set it up:

puppet module install puppetlabs-razor
puppet apply -t -e 'include razor'

This will set up the Razor Server in the Puppet Master.

Now it’s time to obtain the PXE images. You can get the ‘undionly.kpxe’ file here, and the bootstrap.ipxe file from the Razor server itself. Just curl the contents of http://razor:8080/api/microkernel/bootstrap?nic_max=N to a file, but don’t use localhost, since the server name is actually hardcoded into the script. Drop those two in your tftp home directory.

Technically, your infrastructure is now ready to start collecting node information. Boot a new node from the network, and hopefully you’ll see this:

Razor Booting Up

That’s Razor booting, by the way. Give it a couple of minutes (about two in my demo environment) and all the facts about that node will be collected.

So when your node is inventoried, you can issue a ‘razor nodes’ command to see which nodes are ready to be provisioned with Razor.

[root@centos6a ~]# razor nodes
From https://localhost:8151/api/collections/nodes:

+-------+-------------------+--------------------+--------+----------------+
| name  | dhcp_mac          | tags               | policy | metadata count |
+-------+-------------------+--------------------+--------+----------------+
| node1 | 08:00:27:27:14:20 | virtual, bigvirtual| ---    | 0              |
+-------+-------------------+--------------------+--------+----------------+
| node2 | 08:00:27:46:c3:55 | (none)             | ---    | 0              |
+-------+-------------------+--------------------+--------+----------------+
| node3 | 08:00:27:9e:e7:6a | (none)             | ---    | 0              |
+-------+-------------------+--------------------+--------+----------------+

Query an entry by including its name, e.g. `razor nodes node1`

Since Razor is nice enough to tell us to keep digging, well… let’s!

[root@centos6a ~]# razor nodes node1 hw_info
From https://localhost:8151/api/collections/nodes/node1:

  mac: [08-00-27-27-14-20]

Query additional details via: `razor nodes node1 hw_info [mac]`

[root@centos6a ~]# razor nodes node1 facts
From https://localhost:8151/api/collections/nodes/node1:

                 virtual: virtualbox
              is_virtual: true
              interfaces: enp0s3,lo
        ipaddress_enp0s3: 10.20.1.156
       macaddress_enp0s3: 08:00:27:27:14:20
          netmask_enp0s3: 255.255.255.0
              mtu_enp0s3: 1500
            ipaddress_lo: 127.0.0.1
              netmask_lo: 255.0.0.0
[..]
   blockdevice_sr0_model: CD-ROM
            blockdevices: sda,sr0
                     gid: root
           system_uptime:
                            seconds: 124
                              hours: 0
                               days: 0
                             uptime: 0:02 hours
                uniqueid: 007f0100

Query additional details via: `razor nodes node1 facts [dhcp_servers, os, processors, system_uptime]`

Now it’s time to get a repository set up. As an example, let’s create one for CentOS 6.6:

razor create-repo --name centos-6.6 --task centos --iso-url http://mirror.as29550.net/mirror.centos.org/6.6/isos/x86_64/CentOS-6.6-x86_64-bin-DVD1.iso

As you can imagine, this will take a while. Pop open a new shell, and let’s look at brokers in the meantime.

Brokers are basically the way to hand over the server to a post-install configuration management tool (such as Puppet), and also to run some post-install tasks.

If you want to create your own broker (to set up DNS records maybe?), you can read the official documentation, and other prebuilt brokers are documented in the source tree.

In order to set up a basic broker to hand over the server to Puppet Enterprise, create it as follows:

razor create-broker --name pe --broker-type puppet-pe --configuration server=puppet-master.example.com

Finally, we have tags and policies. Tags are applied to servers based on ‘rules’ that evaluate ‘facts’.

As a quick example, if you were to create a tag for virtual servers, it would probably look somewhat like this:

[root@centos6a ~]# razor tags
From https://localhost:8151/api/collections/tags:

+--------------+--------------------------------------+-------+----------+
| name         | rule                                 | nodes | policies |
+--------------+--------------------------------------+-------+----------+
| virtual      | ["=", ["fact", "is_virtual"], true]  | 2     | 1        |
+--------------+--------------------------------------+-------+----------+

Please take a minute here to glance at those amazing tables! Here’s another example from the official documentation. To create a tag for servers with less than a gig of RAM:

razor create-tag --name small --rule '["<", ["num", ["fact", "memorysize_mb"]], 1024]'
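Razor’s rules are just little s-expressions evaluated against a node’s facts. To build some intuition for how they read, here’s a toy evaluator covering only the operators used above (“=”, “<”, “num”, “fact”); this is a sketch of the idea, not Razor’s actual implementation:

```ruby
# Toy evaluator for Razor-style tag rules: arrays are [operator, *args],
# anything else is a literal. Facts come from a plain hash.
def eval_rule(rule, facts)
  return rule unless rule.is_a?(Array)
  op, *args = rule
  case op
  when '='    then eval_rule(args[0], facts) == eval_rule(args[1], facts)
  when '<'    then eval_rule(args[0], facts) < eval_rule(args[1], facts)
  when 'num'  then eval_rule(args[0], facts).to_i
  when 'fact' then facts[args[0]]
  end
end

facts = { 'is_virtual' => true, 'memorysize_mb' => '512' }
eval_rule(['=', ['fact', 'is_virtual'], true], facts)             # => true
eval_rule(['<', ['num', ['fact', 'memorysize_mb']], 1024], facts) # => true
```

Note the “num” wrapper: facts arrive as strings, so the rule has to coerce memorysize_mb before comparing it to 1024.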

Finally, you need to create a policy. Bear in mind that once a policy is created, Razor will start provisioning all your nodes as per that policy, so in a brownfield environment, either try to be as specific as possible, or follow these suggestions.

[root@centos6a ~]# razor policies
From https://localhost:8151/api/collections/policies:

+-------------+----------+-------+--------+---------+---------+---------+------+
|name         |repo      |task   | broker | enabled |max_count| tags    | nodes|
+-------------+----------+-------+--------+---------+---------+---------+------+
|centos-smvirt|centos-6.6|centos | pe     | true    |20       | smlvirt | 1    |
+-------------+----------+-------+--------+---------+---------+---------+------+
|win-bigvir   |win2012r2 |w2012r2| pe     | true    |20       | bigvirt | 1    |
+-------------+----------+-------+--------+---------+---------+---------+------+
|centos-virt  |centos-6.6|centos | pe     | true    |20       | virtual | 0    |
+-------------+----------+-------+--------+---------+---------+---------+------+

Query an entry by including its name, e.g. `razor policies centos-for-smallvirtual`

Taking into account that it took me a couple of days to write this article, I’d say my technical debt to the community is now settled!

Docker, You're doing it wrong

I tend (as many people do) to blame a technology for what people actually do with that technology. But think about it: who’s to blame for Hiroshima, the A-bomb, or the guys from the Manhattan Project?

I know, that’s probably way out of context, and comparing Docker with the A-bomb would definitely get me a bit of heat in certain circles. But here’s the thing: this article is not titled “I hate Docker” (I must confess the idea crossed my mind); it’s rather “Docker: You’re doing it wrong”.

Let’s begin by saying that containers are not a new idea; they have been around for ages in Solaris, Linux, and AIX, among others, and soon enough in Windows! We actually have to thank Docker for that one. Docker was definitely innovative in the way containers are packaged, deployed, etc. It’s a great way to run microservices: you download a minimal image with nginx, or Node, or {insert your language/toolkit/app server here} and run your workload.

Now here’s the problem with that: a lot of people (ahem… developers) will download an image with a stack ready to go, bundle their script, and tell someone from the ops team, “Hey, go and deploy this, you don’t need to worry about the dependencies.” Now a great ops guy will answer, “That’s great, but:”

  • Where did you get this software from?
  • What’s inside this magic container?
  • Who’s going to patch it?
  • Who’s going to support it?
  • What about security?

And so begins a traditional rant, with phrases like “code monkey” and “cable thrower/hardware lover”, that could last for hours, debating the best way to do deployments.

Hey, remember? We’re in the DevOps age now, we don’t fight over those meaningless things anymore. We do things right, we share tooling and best practices.

So let’s be a little more pedagogic about Docker from a System Administrator perspective:

  • Over 30% of the images available in the Docker Hub contain vulnerabilities, according to this report, so where did you get this software from is kind of a big thing.
  • Generally, downloading images “just because it’s easier” leads to questions about what that container is actually running. Granted, it’s you who actually exposes the ports opened by the container, but what if some weird binary decides to call home?
  • If you’re about to tell me that your containers are really stateless, that in CI you create and destroy machines, and that you don’t need to patch, I’ve two words for you: “Heartbleed much?”. Try telling your CIO, or your tech manager, that you don’t need to patch.
  • Is your image based on an enterprise distribution? How far has it been tuned? Who do I call when it starts crashing randomly?
  • XSA-108 anyone? Remember cloud reboots? Even a mature technology that has been on the market for years is prone to security issues, so there is no guarantee that something weird running in your container won’t hit a 0-day and start reading memory segments that don’t belong to it.

Vulnerabilities in Docker Images, extracted from http://www.infoq.com/news/2015/05/Docker-Image-Vulnerabilities

Now, I don’t mean to discourage you from actually using Docker; again, it’s a great tool if you use it wisely. But build your own images! Understand what you’re putting into them. Make it part of your development cycle to actually update and re-release them. Use Docker as it was designed, i.e. stateless!
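As a sketch of what “build your own images” could look like in practice, here’s a hypothetical Dockerfile (the registry and file names are made up, not from any real setup) that pins where everything comes from, so you can rebuild and re-release the image whenever patches land:

```dockerfile
# Hypothetical example: start from a base image your own team maintains,
# install only what you need, and rebuild whenever packages are patched.
FROM your-registry.example.com/base/centos:6.6

# Everything comes from your own repositories, so you can answer
# "where did this software come from?" and "who patches it?".
RUN yum install -y nginx && yum clean all
COPY nginx.conf /etc/nginx/nginx.conf

# Stateless by design: configuration is baked in, data lives elsewhere.
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```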

I know that this might cause flame wars and you are desperately looking for a comments box to express your anger on how I’ve been trashing Docker.

  • First, I haven’t! Docker is a great tool.
  • Second, if you want to correct me on any of the points above, there are plenty of ways to contact me, and I’m happy to rectify my views.
  • Third, feel free to call me a caveman on your own blog, but progress shouldn’t compromise security!