As we transition to building distributed systems, we need to shed some of the anti-patterns of web application development that we have applied for ages.
The ease of access to compute has led most of us to think that we can process every job interactively, and that when things slow down we can just scale by adding instances. Back when that wasn’t the case, and scaling compute actually meant a real estate operation (AKA “get more room for more mainframes”), we actually had process differentiation: batch jobs would run in specific time windows, when CPU time was available.
Now traditionally, if your application is super complex, you have most likely been through this already: you are probably using a message queue or an ESB, and this post might not be for you. If, on the other hand, you are still just piling up code in Rails or Node, there are things that you can probably offload as batch jobs (or, in my case, as parameterized dispatch, but I’ll explain that later). So here I’ve got this library app that I use at home to keep track of my books, and adding a book is, let’s say, a lot of work.
Now from a user perspective, sure, I’m going to see the spinning wheel for a while, but from a compute perspective it is even worse, as I’m wasting a thread that could be used for interactive traffic. Then again, I could spawn a thread in the system, but the Linux scheduler has a somewhat “local” view of the world, as would be expected.
So how about a scheduler that has a more global view? Weren’t you all using Kubernetes?
Well not me, clearly, but then again this pattern would work with any scheduler.
Nomad has the ability to dispatch a previously registered parameterized job, passing in extra metadata or a payload. So if I wanted to do a “distributed” ping, I would register a parameterized job that takes the target as a parameter and dispatch it from the CLI with nomad job dispatch; each dispatch creates a new instance of the job, and the result is an allocation that runs the ping.
In my case, I spawn a Nomad job upon receiving an API call. In that handler I’m getting a Consul token that I pass to the UI (it lasts for a minute), and another short-lived Nomad token from Vault (which expires in five minutes) to make the dispatch request.
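The original handler isn’t reproduced here, but roughly it has this shape. This is a hedged sketch written in Python 3, using hvac-style Vault reads and Nomad’s HTTP dispatch endpoint; the job name, secret backend roles and addresses are illustrative, not the ones from my app:

    import base64
    import requests

    NOMAD_ADDR = "http://127.0.0.1:4646"  # assumed local Nomad agent

    def add_book(vault, title):
        # Short-lived Consul token for the UI (Consul secret backend, assumed role "ui")
        consul_token = vault.read('consul/creds/ui')['data']['token']
        # Short-lived Nomad token from Vault's Nomad secret backend (assumed role "dispatcher")
        nomad_token = vault.read('nomad/creds/dispatcher')['data']['secret_id']
        # Dispatch the parameterized job; metadata keys must match the ones
        # declared in the job's parameterized stanza
        resp = requests.post(
            "%s/v1/job/add-book/dispatch" % NOMAD_ADDR,
            headers={"X-Nomad-Token": nomad_token},
            json={"Meta": {"title": title},
                  "Payload": base64.b64encode(title.encode('utf-8')).decode('ascii')},
        )
        resp.raise_for_status()
        # Returns the token for the UI plus the dispatched job ID and evaluation ID
        return consul_token, resp.json()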
The final result looks fantastic:
About the author:
Nicolas Corrarello is the Regional Director for Solutions Engineering @ HashiCorp based out of London.
Every modern application has a requirement for encrypting certain amounts of data. The traditional approach has been to rely on some sort of transparent encryption, using something like encryption-at-rest capabilities in the storage layer, or column/field-level encryption in database systems.
While this clearly minimizes the requirement for encryption within the application, it doesn’t secure the data from attacks like a SQL injection, someone just dumping data because their account had excessive privileges, or the exposure of backups.
In comes the requirement of doing encryption at the application level, with, of course, the expected complexity of getting the implementation right at the code level (choosing the right ciphers and encryption keys, securing the objects), and of securing and maintaining the actual encryption keys, which more often than not end up in version control, or in some kind of object store subject to the usual issues.
Traditionally, there have been different approaches to securing encryption keys:
An HSM could be used, with a considerable performance penalty.
An external key management system could be used, like Amazon’s KMS, Azure Key Vault, or Google’s KMS, where a third party holds your encryption key.
Ultimately, HashiCorp’s Vault offers its own Transit backend, which allows clients (depending on policy) to offload encryption/decryption/HMAC workflows, as well as signing and verification, abstracting away the complexity of maintaining encryption keys and allowing users and organizations to retain control over them inside Vault’s cryptographic barrier. As a brief summary of Vault’s capabilities when it comes to encryption as a service, we can just refer to Vault’s documentation:
The primary use case for transit is to encrypt data from applications while still storing that encrypted data in some primary data store. This relieves the burden of proper encryption/decryption from application developers and pushes the burden onto the operators of Vault. Operators of Vault generally include the security team at an organization, which means they can ensure that data is encrypted/decrypted properly. Additionally, since encrypt/decrypt operations must enter the audit log, any decryption event is recorded.
The objective of this article, however, is not to explain the capabilities of Vault, but rather to analyze the overhead of using Vault to encrypt a single field in the application.
For that purpose, a MySQL data source is used with the traditional world sample database (about 4,000 rows). The test launches a thread per entry; each thread encrypts one field of its entry and then persists the record to Amazon S3.
This test was run on an m4.large instance in Amazon, running a Vault server (with a Consul backend), a MySQL server, and the script taking the metrics, all on the same system. It is inspired by a traditional Big Data use case, where information needs to be persisted to some sort of object store for later processing. The commented Python function being executed in each thread is as follows:
    def makeJson(vault, s3, s3_bucket, ID, Name, CountryCode, District, Population):
        # Take the starting time for the whole function
        tot_time = time.time()
        # Take the starting time for encryption
        enc_time = time.time()
        # Base64 a single value
        Nameb64 = base64.b64encode(Name.encode('utf-8'))
        # Encrypt it through Vault
        NameEnc = vault.write('transit/encrypt/world-transit',
                              plaintext=bytes(Nameb64),
                              context=base64.b64encode('world-transit'))
        # Calculate how long it took to encrypt
        eetime = time.time() - enc_time
        # Create the object to persist and convert it to JSON
        Cityobj = {"ID": ID,
                   "Name": NameEnc['data']['ciphertext'],
                   "CountryCode": CountryCode,
                   "District": District,
                   "Population": Population}
        City = json.dumps(Cityobj)
        filename = "%s.json" % ID
        # Take the starting time for persisting it into S3, for comparison
        store_time = time.time()
        # Persist the object
        s3.put_object(Body=City, Bucket=s3_bucket, Key=filename)
        # Calculate how long it took to store it
        sstime = time.time() - store_time
        # Calculate how long it took to run the whole function
        tttime = time.time() - tot_time
        print("%i,%s,%s,%s\n" % (int(ID), str(sstime), str(eetime), str(tttime)))
This renders a single line of a CSV per thread, which was later aggregated in order to analyze the data. The average, minimum and median values are as follows:
As for concurrency, this runs about 4,000 threads, instantiated in a for loop. According to this limited dataset (about 4,000 entries), we’re looking at a 5% to 10% overhead in terms of execution time.
It’s worth noting that during the tests Vault barely broke a sweat: top reported it using 15% CPU (against the 140% that Python was using).
This was purposely done at a limited scale, as it wasn’t our intention to test how far Vault could go. Vault Enterprise supports performance replication, which would allow it to scale to fit the needs of even the most demanding applications.
As for the development effort, the only complexity added is a couple of statements to encrypt/decrypt the data, as the Python example shows. It’s worth noting that Vault supports convergent encryption, which causes the same plaintext (under the same context) to return the same ciphertext, in case values need to be looked up, for instance in WHERE clauses.
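For illustration, the decrypt side is symmetrical. Here is a minimal sketch (written for Python 3) following the same hvac-style client and key name used in the function above:

    import base64

    def decrypt_name(vault, ciphertext):
        # Decryption is also a "write", against the transit/decrypt endpoint
        resp = vault.write('transit/decrypt/world-transit',
                           ciphertext=ciphertext,
                           context=base64.b64encode(b'world-transit').decode('ascii'))
        # Vault returns the recovered plaintext base64-encoded
        return base64.b64decode(resp['data']['plaintext'])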
Using this pattern, along with the well-known secret management features in Vault, would help mitigate the top 10 database attacks as documented by the British Chartered Institute for IT:
Excessive privileges:
Using transit in combination with Vault’s database secret backend, an organization can ensure that each user or application gets the right level of access to the data, which in turn can be encrypted, requiring a further level of privilege to decode it.
Privilege abuse:
Using transit, the data obtained, even with the right privileges, is still encrypted, and potentially requires the right “context” to decrypt it, even if the user has access to the decryption endpoint.
Unauthorized privilege elevation:
Much like in the cases above, Vault can determine the right level of access a user gets to a database, effectively terminating the “gold credentials” pattern, while encryption shields the underlying data from operator access.
Platform vulnerabilities:
Even if the platform is vulnerable, the data would be secure.
SQL injection:
As the data is not transparently decrypted by the database, a vulnerable application would mostly dump ciphertext, which can be rewrapped to an updated encryption key upon detection of a vulnerability (a short rewrap sketch follows this list).
Weak audit:
Vault audits encryption and decryption operations, effectively creating an audit trail that makes it possible to pinpoint exactly who has accessed the data.
Denial of service:
Through Sentinel, our policy as code engine, Vault can evaluate traffic patterns through rules and deny access accordingly.
Database protocol vulnerabilities:
As before, even if the data is dumped, it wouldn’t be transparently decrypted.
Weak authentication:
Using Vault’s database secret backend generates short-lived credentials that can be revoked centrally and have the right level of complexity.
Exposure of backup data:
Backups would be automatically encrypted, just like the underlying data.
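As referenced in the SQL injection point above, rotating a transit key and rewrapping the stored ciphertext is a cheap operation. A minimal sketch, again assuming the hvac-style client and the world-transit key from the earlier example:

    import base64

    def rotate_and_rewrap(vault, ciphertexts):
        # Rotate the key so new encryptions use the latest key version
        vault.write('transit/keys/world-transit/rotate')
        # Rewrap existing ciphertext to the newest key version without ever exposing plaintext
        rewrapped = []
        for c in ciphertexts:
            resp = vault.write('transit/rewrap/world-transit',
                               ciphertext=c,
                               context=base64.b64encode(b'world-transit').decode('ascii'))
            rewrapped.append(resp['data']['ciphertext'])
        return rewrapped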
About the author:
Nicolas Corrarello is the Regional Director for Solutions Engineering @ HashiCorp based out of London.
The Jenkins credential store in most enterprises is becoming a potential attack vector. It’s generally filled with long-lived credentials, sometimes even to production systems.
In comes HashiCorp’s Vault, a secret management solution that enables the secure storage of secrets, and the dynamic generation of credentials for your job. Looks like a great match, right? Look at the demo, it certainly looks promising (especially with Jenkins’ beautiful new Blue Ocean UI):
Interested? Let’s dive into it:
What is HashiCorp Vault?
Quite simply, it is a tool for managing secrets. What’s really innovative about Vault is that it has methods for establishing both user and machine identity (through auth backends), so secrets can be consumed programmatically. Identity is ultimately established by a (short-lived) token. It also generates dynamic secrets on a number of backends, such as Cassandra, MySQL, PostgreSQL, SQL Server, MongoDB, etc. I’m not going to cover how to configure Vault here; feel free to head over to the Vault documentation. For all intents and purposes of this document, you can download Vault and run it in “dev mode” for a quick test:
Please don’t run a Vault cluster in dev mode in production, and feel free to reach out through the usual means if you need help.
AppRole
AppRole is a secure introduction method to establish machine identity. In AppRole, in order for the application to get a token, it needs to log in using a Role ID (which is static, and associated with a policy) and a Secret ID (which is dynamic, one-time use, and can only be requested by a previously authenticated user or system). In this case, we have two options:
Store the Role IDs in Jenkins
Store the Role ID in the Jenkinsfile of each project
So let’s generate a policy for this role:
In this case, tokens assigned to the java-example policy would have permission to read a secret on the secret/hello path.
Now we have to create a Role that will generate tokens associated with that policy, and retrieve the token:
Note that in this case, the tokens generated through this role have a time-to-live of 60 minutes. That means that after an hour, the token expires and can’t be used anymore. If your Jenkins jobs are shorter, you can adjust that time-to-live to increase security.
Let’s write a secret that our application will consume:
Now Jenkins will need permissions to retrieve Secret IDs for our newly created role. Jenkins shouldn’t be able to access the secret itself, list other Secret IDs, or even read the Role ID.
And we generate a token for Jenkins to log in to Vault. This token should have a relatively long TTL, but it will have to be rotated:
In this way we’re minimizing attack vectors:
Jenkins only knows its Vault token (and potentially the Role ID) but doesn’t know the Secret ID, which is generated at pipeline runtime and is for one-time use only.
The Role ID can be stored in the Jenkinsfile; without a token and a Secret ID, it is of no use.
The Secret ID is dynamic and one-time use only, and only lives for the short period of time in which it is requested and a login process is carried out to obtain a token for the role.
The role token is short-lived, and it will be useless once the pipeline finishes. It can even be revoked once you’re finished with your pipeline.
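To make that flow concrete, here is a minimal sketch of the same steps against Vault’s HTTP API in Python (the pipeline below does the equivalent with the vault CLI). The role name java-example, the default approle mount path and the address are assumptions:

    import requests

    VAULT_ADDR = "http://127.0.0.1:8200"

    def approle_login(jenkins_token, role_id, role_name="java-example"):
        # 1. Use the Jenkins token (only allowed to mint Secret IDs) to get a one-time Secret ID
        r = requests.post("%s/v1/auth/approle/role/%s/secret-id" % (VAULT_ADDR, role_name),
                          headers={"X-Vault-Token": jenkins_token})
        r.raise_for_status()
        secret_id = r.json()["data"]["secret_id"]
        # 2. Exchange Role ID + Secret ID for a short-lived token scoped to the role's policy
        r = requests.post("%s/v1/auth/approle/login" % VAULT_ADDR,
                          json={"role_id": role_id, "secret_id": secret_id})
        r.raise_for_status()
        return r.json()["auth"]["client_token"]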
Jenkins pipeline and configuration
A full example for the project is available here. The Jenkinsfile we will be using is this one:
Lots of stuff here, including certain Maven tasks, but we will be focusing on the Integration Tests stage (the last one). What we’re doing here is:
Downloading the Vault binary (in my case the ARM linux one, I’ll brag about that in a different blog post :D)
Reading credentials from the Jenkins credential store. In this case I’m storing both the ROLE_ID and the VAULT_TOKEN I’ve generated for Jenkins. As mentioned before, if you want to split them for more security, you can just use the ROLE_ID as a variable in the Jenkinsfile.
I’m doing a set +x to disable verbosity in the shell in order not to leak credentials, although even with -x, in this case I would only be showing the Secret ID (which is useless after I’ve already used it) and the VAULT_TOKEN that I’m going to use to consume credentials, which is short-lived and can be revoked at the end of this run just by adding a vault token-revoke command.
I’m retrieving a SECRET_ID using Jenkins’ administrative token (which I’ve manually generated before; that’s the only one that is relatively long-lived, but it can only generate SECRET_IDs).
I’m doing an AppRole login with the ROLE_ID and the SECRET_ID, and storing that token (short lived).
My Java process is reading VAULT_TOKEN and VAULT_ADDR to contact Vault and retrieve the secret we stored (a Python equivalent of that read is sketched below, purely for illustration).
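The consumer in the demo is Java, so this is not the project’s code; it is just a hedged sketch of the same environment-driven read in Python, assuming the secret was written with a field named value:

    import os
    import requests

    def read_hello():
        vault_addr = os.environ["VAULT_ADDR"]
        vault_token = os.environ["VAULT_TOKEN"]
        # Read the static secret written earlier at secret/hello (KV version 1 style path)
        r = requests.get("%s/v1/secret/hello" % vault_addr,
                         headers={"X-Vault-Token": vault_token})
        r.raise_for_status()
        return r.json()["data"]["value"]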
The VAULT_TOKEN (and optionally the ROLE_ID) are stored in the credential store in Jenkins:
Time to extend your Puppet Language skills! The new features for application orchestration introduce a number of language constructs you’ll want to know about if you’re planning to describe your applications and supporting infrastructure in the Puppet language.
It is worth noting that, unlike your traditional Puppet types and providers, an environment service resource type is far simpler to develop, as you’ll see in the first example below.
The environment service resource is the key to defining the relationship between different software layers. It effectively extends a traditional Puppet type (like file, services, or packages) to describe application services (an SQL database, web server, API layer or a microservice, for example). Like Puppet node types, the environment service resource can (optionally) have a provider, which in this case implements a number of service-related methods — for instance, how to check if an environment service produced by an application component is actually ready for work before continuing the deployment.
The environment service type looks pretty much like any other Puppet type, but it has the property of being a capability. A very simple example could be an SQL resource, as described below:
What I’m doing here is telling the Puppet resource abstraction layer (or rather the service abstraction layer) that I have modeled an SQL database that can be produced by a server (managed with Puppet), for use by one or more servers that need an SQL database. You can think of it as a contract between the servers on your network: What information do they need to exchange in order to interoperate?
The Application Layers
The components in your application should be iterable — that is to say, you should be able to deploy multiple instances of your layers (multiple web servers, for example). What better way to do that than the way you’ve been doing it historically, with defined types?
Here’s where you can leverage your existing modules, profiles and roles. Remember, Puppet is about code reusability; you don’t need to reinvent the wheel every time you do something.
Just to show you a quick example, here’s a defined type for my database layer:
As you can see, this is pretty much standard; I’m creating a new type that defines a database. Now going back to the environment service resource I described before, I could potentially produce an SQL resource out of this database, so let’s modify the code to do that:
Now I’ve told the Puppet parser that this particular defined type is producing a resource, and I’m mapping the parameters of this type to the parameters of that resource.
Just as this defined type produces the resource, I’ve created another defined type to consume it:
Please note two things in this particular code extract:
I’m declaring variables in the defined type that are going to be filled with parameters from the environment service resource.
I’m also declaring that this defined type consumes the environment service resource, but I’m not mapping any variables. That’s because I’m using the same names for the variables.
The Application Construct
Now let’s glue everything together. The application construct allows us to model the application and declare the dependencies within the layer:
As you can see, it’s here where I’m actually defining the variables to configure the services and create the environment service resource.
The Site Construct
As with your layers, you should be able to deploy your application in multiple instances. That’s where the site construct comes into play. In my particular example, I’m defining it in the site.pp file for the particular environment. But here’s where you actually instantiate your application and glue it to the specific nodes (using the hostnames for the specific nodes):
In all fairness, I had the unfair advantage of having used the Puppet language and modules for quite a while now, but as you can see, once again, the code is human-readable and draws from the existing modules on the Puppet Forge. As you can imagine, while this is an extremely simple example, the possibilities are endless:
Think of your microservices running in Docker containers — this is a great way of actually tying the whole application together. See an example here.
Load balancing? Sure, your web servers can produce resources that are consumed by the load balancer.
You can easily replace the nodes in your application, assuming you have current backups. You might be able to re-deploy your database and have Puppet automatically update your web servers.
Now that you have an idea of what the language looks like, you should read the full documentation. It’s worth noting that the Puppet Orchestrator, which is the set of tools that will actually make the orderly deployment possible, is not covered here. But you can learn about it from the documentation linked above and below.
[*] This article was originally authored for the Puppet Labs Blog (http://blog.puppetlabs.com).
I did a short presentation on how to automate the configuration of Arista devices with Puppet at the Arista meetup. Luckily I recorded it, for those of you who weren’t lucky enough to see it live. The video of the screencast is below.
Another week, another commute to a different country. You do have to love Europe!
Anyway, with an hour on my hands, I started thinking back to, well, probably the question I get asked the most: “What can you do with Puppet?”
There is probably no short or straight answer to it, so I tend to start enumerating anything that might be relevant, and lately I’ve been finishing up with, “and also, if you have anything that exposes resources through an API, well, you can wrap Puppet code around it!”. An excellent example of this could be Gareth’s AWS module. It is, however, too complicated to illustrate the principle, since it’s a fairly complex module with quite a bit of Ruby code. For those who don’t know me, I don’t do Ruby… well… kind of.
Ruby does have a number of brilliant gems (see what I did there?), and ever since I started hearing about Ruby, the one that always caught my eye was Sinatra.
It is one of those things that look amazingly simple yet are really powerful, and that, of course, I always wanted to play around with. With all this in mind, I decided to write an extremely simple REST API, and a Puppet module around it based on very simple curl calls, to demo how this would look, and of course, I did it my way.
The absolutely basic Sinatra syntax is:
So let’s look at a commented example:
So with a few bits and bolts of Ruby code here and there, here’s my full API example, with three basic verbs.
I’m sure there will be a lot of Ruby experts criticizing how that code is written but, to be honest, taking into account that this is the first piece of Ruby code I’ve written, and that I lifted some examples from everywhere, I’m quite happy with my Frankenstein API.
So let’s curl around to see how my API works.
If I PUT an item there, it saves it to the file, but of course if the item already exists, it won’t allow me to do it again:
Note that when I try to PUT the item a second time, it returns an HTTP code of 422. By the way, I spent a bit of time researching which would be the right HTTP code to return in this case. I think I got it right, but if anyone has a table documenting how to map these, please do send it over.
By the way, depending on the Sinatra / WEBrick version, you may need to specify a Content-Length (even if it’s zero) or you might get an HTTP/1.1 411 Length Required response. It would look like this:
Now, how about a couple of GETs:
Please note that GET is, of course, the default action, but I’m specifying it just for documentation purposes. A GET to /items returns the whole contents of the file, while a GET to /items/specificitem checks whether the item (a string in this case) exists in the file. A GET to a non-existent item returns 404.
Finally, let’s delete something.
For all intents and purposes, that’s a great API, it even has error checking!
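Just as an illustration of the same calls from code, here is a hedged sketch using Python’s requests library instead of curl. The exact routes of my toy API aren’t reproduced above, so the paths and the port (Sinatra’s default, 4567) are assumptions based on the description:

    import requests

    BASE = "http://localhost:4567"

    # PUT a new item; an empty body keeps Content-Length at zero, as noted above.
    # The second identical PUT should be rejected with a 422.
    print(requests.put("%s/items/somebook" % BASE, data="").status_code)
    print(requests.put("%s/items/somebook" % BASE, data="").status_code)  # 422

    # GET the whole list, then a specific item; a missing item returns 404
    print(requests.get("%s/items" % BASE).text)
    print(requests.get("%s/items/somebook" % BASE).status_code)  # 200
    print(requests.get("%s/items/missing" % BASE).status_code)   # 404

    # Finally, DELETE the item
    print(requests.delete("%s/items/somebook" % BASE).status_code)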
But the best is yet to come: now it’s time to actually wrap a Puppet module around it. As I’ll be using exec, I have to be spot on about managing errors. Now here’s the kicker… in order for curl to return an exit code greater than 0 for an HTTP error code of 400 or above, you have to use the -f parameter in curl.
So in this case, I basically created a module (which is available along with the full API code at https://github.com/ncorrare/ncorrare-itemsapi). The key here is the resource definition:
That’s it! You now have a new type to manage an API, which is fully idempotent, so you can basically:
or:
By the way, can you find the references to Sinatra (the actual one) in the blog post? There are two (I think!)