Provisioning VMware Workstation Machines from Artifactory with Vagrant

I wrote a small Vagrantfile and helper library for provisioning VMware VMs from boxes hosted on Artifactory. I put this together with the intent of helping us easily provision our Rancher/Cattle/Docker-based platform wholesale on our machines to test changes before pushing them up.

Here it is: https://github.com/carlosonunez/vagrant_vmware_artifactory_example
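For flavor, here is a minimal sketch of the idea. This is not the repo’s actual Vagrantfile; the box name, Artifactory URL and VMX settings are hypothetical, and the provider name assumes the HashiCorp VMware Workstation plugin.

$> cat Vagrantfile
# Sketch only: pull the box straight from an Artifactory-hosted box repository.
Vagrant.configure("2") do |config|
  config.vm.box     = "rancheros-vmware"                # hypothetical box name
  config.vm.box_url = "https://artifactory.example.com/artifactory/vagrant-local/rancheros-vmware.box"

  config.vm.provider "vmware_workstation" do |vmware|
    vmware.vmx["memsize"]  = "2048"                      # tune the VM through raw VMX settings
    vmware.vmx["numvcpus"] = "2"
  end
end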

Tests are to be added soon! I’m thinking Cucumber integration tests with unit tests on the helper methods and Vagrantfile correctness.

I also tried to emphasize small, isolated and easily readable methods with short call chains and zero side effects.

The pipeline would look roughly like this:

  • Clone repo containing our Terraform configurations, cookbooks and this Vagrantfile
  • Make changes
  • Do unit tests (syntax, linting, coverage, etc)
  • Integrate by spinning up a mock Rancher/Cattle/whatever environment with Vagrant
  • Run integration tests (do lb’s work, are services reachable, etc)
  • Vagrant destroy for teardown
  • Terraform apply to push changes to production

We haven’t gotten this far yet, but this Vagrantfile is a good starting point.
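Roughly, that pipeline boils down to something like this in shell (the test scripts are hypothetical placeholders):

$> git clone git@git.example.com:team/infrastructure.git && cd infrastructure
$> ./scripts/unit_tests.sh          # syntax, linting, coverage
$> vagrant up                       # spin up the mock Rancher/Cattle environment
$> ./scripts/integration_tests.sh   # do LBs work? are services reachable?
$> vagrant destroy -f               # teardown
$> terraform apply                  # push the change to production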

Some Terraform gotchas

So you’ve got a bacon delivery service repository with Terraform configuration files at the ready, and it looks something like this:


$> tree
.
├── main.tf
├── providers.tf
└── variables.tf

0 directories, 3 files

terraform is applying your configurations and saving their state in tfstate like you’d expect. Awesome.

Eventually, your infrastructure scales just large enough to necessitate a directory structure. You want to express your Terraform configurations in a way that (a) makes it easy to see what’s in which environment, (b) makes it easy to modify each environment without affecting the others, and (c) keeps your HCL from becoming the same kind of mess you’d end up with in Puppet or Chef.

Fortunately, Terraform makes this pretty easy to do…but not without some gotchas.

One suggestion: Use modules!

Modules give you the ability to reuse Terraform resources throughout your codebase. This way, instead of having a bunch of aws_instance resources lying around in your main.tf, you can neatly express them in ways that make more sense:


module "sandbox-web-servers" {
  source = "../modules/aws/sandbox"
  provider = "aws.us-west-1"
  environment = "sandbox"
  tier = "web"
  count = 10
}

When you do this, you need to populate Terraform’s module cache by running terraform get from (or pointed at) the directory containing the configuration that references the module.
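For example, assuming the infrastructure/sandbox layout shown later in this post:

$> cd infrastructure/sandbox   # the configuration that declares the module
$> terraform get               # copies/links the module source into .terraform/modules
$> terraform plan              # the module now resolves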

Gotcha #1: Self variable interpolation isn’t a thing yet.

If you noticed, the example above references “sandbox” quite a lot. This is because, unfortunately, Terraform modules (and resources, I believe) do not yet support self-referencing variables. What I mean is this:


module "sandbox-web-server" {
  environment = "sandbox"
  source = "../modules/${var.self.environment}"
  ...
}

Given that everything in Terraform is a directed graph, the complexity in doing this makes sense. How do you resolve a reference to a variable that hasn’t been defined yet?

This was tracked here, but it looks like a blue-sky feature right now.

Gotcha #2: Module source paths are relative to the module.

Let’s say you had a module definition that looked like this:


module "sandbox-web-servers" {
  source = "modules/aws/sandbox"
}

and a directory structure that looked like this:


$> tree
.
├── infrastructure
│   └── sandbox
│       └── web_servers.tf
└── modules
    └── aws
        └── sandbox
            └── main.tf

5 directories, 2 files

Upon running terraform apply, you’d get an awesome error saying that modules/aws/sandbox couldn’t be located, even if you ran it from the root of the repository. You’d wonder why, given that Terraform is supposed to resolve everything relative to the location from which it was executed.

It turns out that modules don’t work that way. When modules are loaded with terraform get, their source paths are resolved relative to the configuration that declares the module, not the directory you ran terraform from. I haven’t looked too deeply into this, but it’s likely due to the way in which Terraform populates its graphs.

To fix this, you’ll need to either (a) create symlinks in all of your modules pointing to your module source, or (b) fix your sources to use paths relative to the configuration that declares the module, like this:


module "sandbox-web-servers" {
  source "../../modules/aws/sandbox"
  ...
}

Gotcha #3: Providers must co-exist with your infrastructure!

This one took me a few hours to reason about. Let’s go back to the directory structure referenced above (which I’ve included again below for your convenience):


$> tree
.
├── infrastructure
│   └── sandbox
│       └── web_servers.tf
└── modules
    └── aws
        └── sandbox
            └── main.tf

5 directories, 2 files

Since you deploy to multiple providers (nitpick: nearly every Terraform example I’ve seen assumes you’re using AWS!), you want to create a providers folder to express this. Additionally, since your infrastructure might be defined differently per environment and you want whatever is actually calling terraform to assume as little about your infrastructure as possible, you want to break it down by environment. When I tried this, it looked like this:


.
├── infrastructure
│   └── sandbox
│       └── web_servers.tf
├── modules
│   └── aws
│       └── sandbox
│           └── main.tf
└── providers
    ├── openstack
    ├── colos
    ├── gce
    └── aws
        ├── dev
        │   ├── main.tf
        │   └── variables.tf
        ├── pre-prod
        │   ├── main.tf
        │   └── variables.tf
        ├── prod
        │   ├── main.tf
        │   └── variables.tf
        └── sandbox
            ├── main.tf
            └── variables.tf

14 directories, 10 files

You now want to reference this in your modules:


# infrastructure/sandbox/aws_web_servers.tf
module "sandbox-web-servers" {
  source = "../../modules/aws/sandbox"
  provider = "aws.sandbox.us-west-1" # using a provider alias
  ...
}

and are in for a pleasant surprise when you discover that Terraform fails because it can’t locate the “aws.sandbox.us-west-1” provider.

I initially assumed that when Terraform looked for the nearest provider, it would search the entire directory tree for a suitable one; in other words, that it would follow a search path like this:


- ./infrastructure/sandbox
- ./infrastructure
- .
- ./modules
- ./modules/aws
- ./modules/aws/sandbox
- .
- ./providers
- ./providers/aws
- ./providers/aws/sandbox <-- here

But that’s not what happens. Instead, Terraform looks for providers in the configuration that references the module. This meant that I had to put providers.tf in the same place as aws_web_servers.tf.
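That co-located providers.tf is just an aliased provider definition. A minimal sketch (the region is illustrative, and the alias has to match whatever your resource and module blocks reference) looks like this:

$> cat infrastructure/sandbox/aws/providers.tf
# The alias is what other blocks reference, e.g. "aws.sandbox".
provider "aws" {
  alias  = "sandbox"
  region = "us-west-1"
}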

I couldn’t even get away with putting it in the environment directory above it (i.e. ./infrastructure/sandbox) because Terraform doesn’t currently support object inheritance.

Instead of re-defining my providers in every directory, I created my providers.tf in every infrastructure environment folder I had (which is just sandbox at the moment) and symlinked it in every folder underneath it. In other words:


carlosonunez@DESKTOP-DSKP2VT:/tmp/terraform$ ln -s ../providers.tf infrastructure/sandbox/aws/providers.tf
carlosonunez@DESKTOP-DSKP2VT:/tmp/terraform$ ls -lart infrastructure/sandbox/aws/
total 0
-rw-rw-rw- 1 carlosonunez carlosonunez  0 Dec  6 23:52 web_servers.tf
drwxrwxrwx 2 carlosonunez carlosonunez  0 Dec  7 00:14 ..
drwxrwxrwx 2 carlosonunez carlosonunez  0 Dec  7 00:14 .
lrwxrwxrwx 1 carlosonunez carlosonunez 15 Dec  7 00:14 providers.tf -> ../providers.tf
carlosonunez@DESKTOP-DSKP2VT:/tmp/terraform$ tree
.
├── infrastructure
│   └── sandbox
│       ├── aws
│       │   ├── providers.tf -> ../providers.tf
│       │   └── web_servers.tf
│       └── providers.tf
├── modules
│   └── aws
│       └── sandbox
│           └── main.tf
└── providers
    ├── aws
    │   ├── dev
    │   │   ├── main.tf
    │   │   └── variables.tf
    │   ├── pre-prod
    │   │   ├── main.tf
    │   │   └── variables.tf
    │   ├── prod
    │   │   ├── main.tf
    │   │   └── variables.tf
    │   └── sandbox
    │       ├── main.tf
    │       └── variables.tf
    ├── colos
    ├── gce
    └── openstack

15 directories, 12 files

It’s not great, but it’s a lot better than re-defining my providers everywhere.

Gotcha #4: Unset your provider env vars!

So the thing in Gotcha #3 never happened to you. Everything seemed to deploy just fine. That is, until you realized you were deploying to the production account instead of dev, which Finance abruptly informed you of when they wondered why you had spun up $15,000 worth of compute. Oops.

This is because of a thoughtful-yet-conveniently-unfortunate side effect of providers whereby (a) most of them support using environment variables to define their behavior, and (b) Terraform has no way of turning this off (an issue I recently raised).

For now, unset the environment variables used by boto, the OpenStack CLI, gcloud or whatever provider tooling you might be using before running terraform commands. That, or run terraform in a clean shell using /bin/sh.
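A minimal bit of shell hygiene before running Terraform might look like this (the variable names are the standard ones read by the AWS, OpenStack and Google tooling):

$> env | grep -E 'AWS_|OS_|GOOGLE_'    # see what a provider could silently pick up
$> unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_PROFILE AWS_DEFAULT_REGION
$> unset OS_USERNAME OS_PASSWORD OS_TENANT_NAME OS_AUTH_URL
$> unset GOOGLE_APPLICATION_CREDENTIALS
$> terraform plan                      # now uses only what your .tf files define
$> env -i PATH="$PATH" /bin/sh -c 'terraform plan'   # or: run it in a completely clean shell instead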

That’s it!

I’m really enjoying Terraform. I hope you are too! Do you have any other gotchas? Want to leave some feedback? Throw in a comment below!

About Me


I’m a DevOps consultant for ThoughtWorks, a software company striving for engineering excellence and a better world for our next generation of thinkers and leaders. I love everything DevOps, Windows, and Powershell, along with a bit of burgers, beer and plenty of travel. I’m on twitter @easiestnameever and LinkedIn at @carlosindfw.

Making sense of this ChatOps thing

So I’m still not entirely sold on the urgency or importance of “chatops.”

I’m a huge fan of Google Assistant, née Now. I wish I could replace Siri with it. It can answer nearly any question you throw at it, and it is smart enough to do contextual things that resemble conversations. For fun, while away in Lewisville, TX, I just asked Siri to navigate me to my favorite winery, Messina Hof, in Grapevine, TX. Here’s what it came back with:

[Screenshot: Siri’s navigation result]

Not very useful. What’s a Messina?

Google Assistant, on the other hand, knows what’s up…kind of:

[Screenshot: Google Assistant’s navigation result]

It didn’t get me to the Grapevine location my fiancée and I always go to, but it (a) knew I was talking about Messina Hof, and (b) navigated me to their biggest vineyard in Bryan, TX (a.k.a. Aggieland, opinions notwithstanding).

Here’s the thing, though: in almost every case, I will probably just open Google Maps and search for the location there. I’m sure that, in the near future, Assistant will be knowledgeable enough to know the exact location I want and whether I should stop for gas and a coffee on the way there (Google’s awesome new phone will probably help accelerate that). In the present, however, it’s a lot faster to do all of that from the app.

Which kind of explains my issue with chatops.

What’s ChatOps?

PagerDuty (awesome on-call management app, highly recommend) explains that, holistically, chatops:

…is all about conversation-driven development. By bringing your tools into your conversations and using a chat bot modified to work with key plugins and scripts, teams can automate tasks and collaborate, working better, cheaper and faster.

Since this is DevOps and that definition wouldn’t be complete without referring to tooling of some sort, remember this?

[Image: an AOL Instant Messenger chat bot]

Think that, but with your infrastructure, more Slack, a more modern Web and less early-2000s nostalgia:

[Image: a chatops conversation in Slack]

The overall goal of chatops is to use the communication mediums we already live in every day to manage workflows and infrastructure more seamlessly. (To me, email automation would not only squarely fit this design philosophy, but, as discussed later, would also probably be the most compatible and far-reaching option for most people.)

I’m not saying ChatOps isn’t awesome.

There are several frameworks out there that enable companies and teams to start playing around. Hubot, by GitHub, is the most well-known one. It works with just about every messaging platform out there, including Lync if you have an XMPP gateway set up. Slack integrations and webhooks are also very popular with companies using that product. When implemented correctly, chatops can be quite powerful.

Being able to say phrases like /deploybot deploy master of <project> to preprod or /beachbot create a sandbox environment for myawesometool from carlosnunez’s fork on Slack or Jabber, and have them acted on, would be incredibly neat, not to mention incredibly fast. This can be immensely valuable in several high-touch situations, such as troubleshooting unexpected infrastructure issues or automating product releases from a common tool.

More mature implementations can go much, much deeper than that.


I listened to a fascinating episode of Planet Money recently that explained a pivotal period of growth for Subaru in the late 1990s and early 2000s. Subaru was struggling to compete with booming Japanese automakers at the time; its competitors were producing cheaper cars faster and were aggressively (and successfully) targeting the mid-market that Subaru classically did well in. Growth eventually went negative, and morale plummeted with it.

In the late 1990s, while trying to find a modicum of success with what they already had, they made a discovery: out of their entire lineup, only one car was selling consistently, the Impreza. They sought to find out why.

What they found was surprising. This car, and only this car, had a strong positive correlation with female buyers, specifically women who lived together. So, with the help of Mulryan/Nash, their ad agency, they tried something rash: they aimed to target homosexual couples almost exclusively in their ad campaigns.

Their sales soared. In fact, they were one of the few auto manufacturers whose sales kept growing through the 2008 Global Financial Crisis.

(Check out the full story here if you’re interested in learning more!)

Wouldn’t it have been awesome if they had had bots that scoured sales demographics data from their network of dealerships and turned the trends they uncovered into emails or chats that marketing or sales managers could parse and make these same decisions on? How much faster do you think they would have identified this and acted on it? How many other trends could they have uncovered and turned into sales?

That’s what I think when I hear about ChatOps. But let’s get back to reality.

I’m saying that it’s just not that crucial.

There are a lot of things that have to be done “right” before chatops can work. Monitoring and alerting have to be on point, especially for things like automated alert or alarm bots. Creating new development environments has to be automated, or at least follow a consistent process from which automation can grow. Configuration management has to exist and has to be consistent for deployment bots to work. The list goes on.

Herein lies the rub: for engineers, accomplishing these things from a command-line tool is just as simple, and developers and engineers tend to spend as much time in their tools as in their IM clients. Furthermore, implementing new systems introduces complexity, so introducing chatops to an organization whose tooling needs improvement will usually lead to my Messina-that-isn’t-Messina-Hof situation from before, where the quality of both toolsets ultimately suffers. So if the goal of implementing chatops is to make engineering’s life easier (or to give non-technical people more understandable views into their tech), there might be easier and more important wins to be had first.

It’s not the end-all-be-all…yet.

Financial companies, tech-friendly law firms and news organizations use chatops to help model the state of markets, find trends in big law to identify new opportunities and uncover breaking news to broadcast around the world. The intrinsic value of ChatOps is definitely apparent.

That said, the foundation of the house comes first. Infrastructure, process and culture have to be solid and at least somewhat automated before chatops can make sense.

About Me


I’m a DevOps consultant for ThoughtWorks, a software company striving for engineering excellence and a better world for our next generation of thinkers and leaders. I love everything DevOps, Windows, and Powershell, along with a bit of burgers, beer and plenty of travel. I’m on twitter @easiestnameever and LinkedIn at @carlosindfw.

Driving technical change isn’t always technical

[Image: an office overflowing with paper files]

Locked rooms full of potential secrets were nothing new for a multinational enterprise that a colleague of mine consulted for a few years ago. A new employee stumbling upon one of these rooms, however, was.

What that employee found in his accidental discovery was a bit unusual: a room full of boxes, all of which were full of neatly-filed printouts of what seemed like meeting minutes. Curious about his new find, he asked his coworkers if they knew anything about this room.

None did.

It took him weeks to find the one person who had a clue about this mysterious room. According to her, one team was asked to summarize their updates every week, and every week, someone printed them out, shipped them to the papers-to-the-metaphoric-ceiling room and filed them.

Seems strange? This fresh employee thought so. He sought to find out why.

After a few weeks of semi-serious digging, he excavated the history behind this process. Many, many years ago (I’m talking about bring-your-family-into-security-at-the-airport days), an executive was on his way to a far-away meeting and remembered along the way that he had forgotten to bring a summary of updates for an important team that was to come up in discussion. Panicked, he asked his executive assistant to print it out and bring it to him posthaste. She did.

To prevent this from happening again, she printed out and filed this update every week in the room that eventually became the paper jungle gym. She trained her replacement to do this, her replacement trained her replacement; I think you see where this is headed. The convenience eventually became a “rule,” and because we tend to conform in social situations, this rule was never contested.

None of those printed updates in that room were ever used.


This has nothing to do with DevOps.

Keep reading.

I’m not sure what became of that rule (and neither is my colleague). There is one thing I’m sure of, though: tens of thousands of long-lived companies of all sizes have processes like these. Perhaps your company’s deployments to production depend on an approval from some business unit that’s no longer involved with the frontend. Perhaps your company requires a thorough and tedious approval process for new software regardless of its triviality or use. Perhaps your team’s laptops and workstations are locked down as tightly as those of a business analyst who only uses a computer for Excel, Word and PowerPoint. (It’s incredible what they can do. Excel itself is a damn operating system; it even includes its own memory manager.)

Some of the simplest technology changes you can make to help your company go faster to market don’t involve technology at all. If you notice a rule or process that doesn’t make sense, it might be worth your while to do your own digging and question it. More people might agree with you than you think.

About Me

I’m a DevOps consultant for ThoughtWorks, a software company striving for engineering excellence and a better world for our next generation of thinkers and leaders. I love everything DevOps, Windows, and Powershell, along with a bit of burgers, beer and plenty of travel. I’m on twitter @easiestnameever and LinkedIn at @carlosindfw.

Config management and cloud provisioning: There be dragons

So I’ve tried using configuration management to deploy infrastructure to two different clouds and learned this: whenever you think “it would be great if we could deploy to EC2 with Chef,” use CloudFormation or Terraform instead.

Why? Here are a few reasons that come to mind:

  • CloudFormation/Terraform is easier. Terraform’s HCL is nicer than CloudFormation JSON, but both are *way* easier than trying to shoehorn Jinja2 (Ansible) or chef-provisioning Ruby into doing what you want. Like, hundreds of lines easier. (A short sketch of what this looks like follows this list.)

    I once tried to use Ansible to automate provisioning of Active Directory forests onto EC2. I had to create my own roles for handling AMI selection, security group CRUD operations, EBS provisioning, etc. The 2000+ lines of YAML I wrote to uphold all that bass ultimately became about 200 lines of ugly, yet functional, CloudFormation JSON.

    Yeah.
  • Built-in rollback is awesome. CloudFormation and Terraform both support some kind of rollback. Chef provisioning does as well with the :rollback action (I don’t think Ansible does; at least it didn’t when I used the EC2 plugin), but it’s not guaranteed.
  • I really liked the CloudFormation API. I haven’t tried Terraform’s CLI yet, but I would imagine that it’s just as awesome. aws cloudformation provides a lot of useful information that’s easy to act upon in a Chef recipe or Ansible play, especially given that both platforms have support for CloudFormation “built-in.” What’s better, the AWS SDKs have full support for CloudFormation as well, which means…
  • You’re not locked into anything. This was the biggest takeaway from my experiences using chef-provisioning or ansible-ec2. If you ever decide to move away from Chef or Ansible, you’ll need to port over your deployment code with it. Depending on the platform, this could take anywhere from hours to weeks.

    Not a problem with CloudFormation or Terraform. Perhaps you’ll need to change how your Chef shell resource behaves, but that’s a lot easier to deal with, in my opinion.
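For a sense of scale, here’s a hedged sketch of the kind of HCL that replaces all of that plumbing. The AMI ID, key pair name and instance count are hypothetical.

$> cat web_servers.tf
provider "aws" {
  region = "us-west-1"
}

# Two small web servers; everything here is illustrative.
resource "aws_instance" "web" {
  count         = 2
  ami           = "ami-abc12345"
  instance_type = "t2.micro"
  key_name      = "deploy-key"
}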

Using your config management solution to do it all is really attractive. It’s usually not a bad idea either. However, when it comes to cloud, tread carefully!

About Me

Carlos Nunez is a DevOps consultant for ThoughtWorks, a software company striving for engineering excellence and a better world for our next generation of thinkers and leaders. He loves everything DevOps, Windows, and Powershell, along with a bit of burgers, beer and plenty of travel.

Follow him on Twitter! @easiestnameever.

Start small; move fast

Seinfeld wasn’t always the heavily-syndicated network cash cow it is today. The hit show started as an experiment for Jerry Seinfeld and Larry David. They wanted to write a show about the life of a comedian in New York, namely Jerry’s. Despite Jerry’s limited acting and writing experience, they wrote their pilot in the late 1980s and sold it as “The Seinfeld Chronicles,” which made its national debut on NBC in July 1989.

I’ll spare you the details, but eventually the crew found their beat and, shortly afterwards, historic levels of success. I will say this, though: every episode of Seinfeld was based on a personal story from someone on its writing staff, and written by that person. Compared to the sitcom-by-committee shows that prevailed at the time, this was a small but drastic change that eventually made its way into the mainstream. (For example, several cast members of The Office, a favorite of mine, wrote their own episodes; some more than once.)

Moving fast; not as fast as you might think

I don’t know much else about sitcoms, but I do know this: DevOps is chock-full of hype that’s very easy to get lost in. Super-fast 15 minute standups across teams that magically get things done. Lightweight Python or Ruby apps that somehow manage to converge thousands of servers to relentless uniformity. Everything about the cloud. Immutable infrastructure that wipes instead of updates. It’s very tempting to want to go fast in a world full of slow, but doing so without really thinking about it can lead to fracturing, confusion and, ironically, even more slowness.

Configuration management is a pertinent example of this. Before the days of Chef, Puppet or even CFEngine, most enterprises depended on huge, complex configuration management databases (CMDBs), ad-hoc scripts and mountains of paperwork, documentation and physical run-books to manage their “estate” or “fleet.” It was very easy for CFOs to justify the installation and maintenance of these systems: audits were expensive, violating the rules that audits usually exposed was even more expensive, and the insanely-complex CMDBs that required leagues of consultants to provision were cheap in comparison.

Many of these money-rich companies are still using these systems to manage their many thousands of servers and devices. Additionally, many of them also have intricate and possibly stifling processes for introducing new software (think: six months, at minimum, to install something like Sublime Text). Introducing Chef to the organization without a plan sounds awesome in theory but can easily lead to non-trivial amounts of sadness in reality.

The anatomy of the status quo

There are many reasons why I think this is, at least from what I’ve noticed during my time at large orgs. Here are the two that I’ve observed most frequently:

  • People fear/avoid things that they don’t understand. HuffPo ran an article about this in 2011. They found that most people feel more comfortable with things that have been around longer than with things that haven’t. The same goes for much of what goes on at work. New things mean new processes, new training and new complexities.
  • Some things actually exist for a reason. Many people using change management tools for the first time deride them as useless formalities from the days when systems were mainframes and engineers carried slide rules. However, much of their value actually stems from complying with, and staying flexible around, the similarly-complicated regulations to which those companies are beholden. Consequently, trying to replace all of that with JIRA, while not impossible, will be an incredibly epic uphill battle.

Slow is smooth; smooth is fast.

Now, I’m not saying all of this to say that imposing change in the enterprise is impossible. Nordstrom, for instance, went from a stolid retail corporation to a purveyor of open source tech. NCR, GE and other corporate Goliaths that you might recognize are doing the same.

What I am saying, however, is to do something like what Jerry Seinfeld did: start small, and start lean. If you’ve been itching to bring Ansible to your company in a big way, it might be worthwhile to tap into the company’s next wonder-child investment and use it for a small section of the project. Passionate about replacing scp scripts with GitHub? It might be worthwhile to find a prominent project that’s using this approach and implement it for them. (Concessions are actually a very powerful way of introducing change when done right. In fact, doing favors for people is an old sales trick, as experiments have shown that people feel beholden to those who do favors for them.)

Finding a pain point, acting on it in a smart way and failing fast are the principal tenets of doing things the “lean” way, and you don’t even need to create your own LLC to do it! In fact, to me, this is what DevOps is really about: using technology in smart ways to get business done by getting everyone on the same page.

About Me

Carlos Nunez is a DevOps consultant for ThoughtWorks, a software company striving for engineering excellence and a better world for our next generation of thinkers and leaders. He loves everything DevOps, Windows, and Powershell, along with a bit of burgers, beer and plenty of travel.

Follow him on Twitter! @easiestnameever.

Winning at Ansible: How to manipulate items in a list!

The Problem

Ansible is a great configuration management platform with a very, very extensible language for expressing your infrastructure as code. It works really well for common workflows (deploying files, adding authorized_keys, creating new EC2 instances, etc.), but its limitations become readily apparent as you begin embarking on more custom and complex plays.

Here’s a quick example. Let’s say you have a playbook that uses a variable (or var in Ansible-speak) containing a list of dictionaries, like this:

important_files:
  - file_name: ssh_config
    file_path: /usr/shared/ssh_keys
    file_purpose: Shared SSH config for all mapped users.
  - file_name: bash_profile
    file_path: /usr/shared/bash_profile
    file_purpose: Shared .bash_profile for all mapped users.

(You probably wouldn’t manage files in Ansible this way, as it already comes with a fleshed-out module for doing things with files; I just wanted to pick something that was easy to work with for this post.)

If you wanted to get a list of file_names from this var, you can do so pretty easily with set_fact and map:

- name: "Get file_names."
set_fact:
file_names: "{{ important_files | map(attribute='file_name') }}"

This should return:

[ u'ssh_config', u'bash_profile' ]

However, what if you wanted to modify every file name to add some sort of identifier, like this:

[ u'ssh_config_12345', u'bash_profile_12345' ]

The answer isn’t as clear. One of the top answers I found for this suggested extending the map Jinja2 filter to make this happen, but (a) I’m too lazy for that, and (b) I don’t want to depend on code that might not be present on an actual production Ansible management host.

The solution

It turns out that the solution for this is more straightforward than it seems:

- name: "Set file suffix"
set_fact:
file_suffix: "12345"

- name: &quot;Get and modify file_names.&quot;
set_fact:
file_names: "{{ important_files | map(attribute='file_name') | list | map('regex_replace','(.*)','\\1_{{ file_suffix }}') | list }}"

Let’s break this down and explain why (I think) this works:

  • map(attribute='file_name') pulls the value of the given attribute out of every item in the list.
  • list casts the generated data structure back into a list (I’ll explain this below)
  • map('regex_replace', pattern, replacement) runs the regex substitution against every string in the list. This is what actually does what you want.
  • list casts the results back down to a list again.

The thing that’s important to note about this (and the thing that had me hung up on it for a while) is that every call to map (and most other Jinja2 filters) returns a lazy generator object, NOT the list of results you were expecting!

What this means is that if you did this:

- name: "Set file suffix"
set_fact:
file_suffix: "12345"

- name: "Get and modify file_names."
set_fact:
file_names: "{{ important_files | map(attribute='file_name') | map('regex_replace','(.*)','\\1_{{ file_suffix }}') }}"

You might not get what you were expecting:

ok: [localhost] => {
    "msg": "Test - <generator object do_map at 0x7f9c15982e10>."
}

This is sort-of, kind-of explained in this bug post, but it’s not very well documented.
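To put the whole chain together, here’s a small, self-contained playbook. The file name is hypothetical, and I’ve appended the suffix with a plain end-of-string anchor instead of a backreference, which sidesteps the backslash-escaping dance that quoted YAML and Jinja need for \1.

$> cat modify_list_demo.yml
- hosts: localhost
  connection: local
  gather_facts: no
  vars:
    file_suffix: "12345"
    important_files:
      - file_name: ssh_config
        file_path: /usr/shared/ssh_keys
      - file_name: bash_profile
        file_path: /usr/shared/bash_profile
  tasks:
    - name: "Get and modify file_names."
      set_fact:
        # '$' matches the end of each string, so the replacement appends the suffix.
        file_names: "{{ important_files | map(attribute='file_name') | list | map('regex_replace', '$', '_' ~ file_suffix) | list }}"

    - name: "Show the result."
      debug:
        var: file_names

Running ansible-playbook modify_list_demo.yml should print ssh_config_12345 and bash_profile_12345 under file_names.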

Conclusion

This is the first of a few blog posts on my experiences of using and failing at Ansible in real life. I hope that these save someone a few hours!

About Me

Carlos Nunez is a site reliability engineer for Namely, a modern take on human capital management, benefits and payroll. He loves bikes, brews and all things Windows DevOps and occasionally helps companies plan and execute their technology strategies.

BYOD Part 1: Computers In The Cloud

Computing is expensive. Desktops and laptops cost lots of money. Printers cost even more money. (Printers are really funny, actually; buying one or two isn’t so bad, but once you’re managing tens or hundreds of laser printers and printing hundreds or thousands of pages per day, the cost of toner/ink and repairs skyrockets like a SpaceX rocket.) Desks cost even more money. Accessories cost even more money. The list goes on and on, ad infinitum, ad nauseam.

Do you like saving money and hate fixing broken computers? Read on.

Now that we live in an age where downloading high-def movies takes less time than starting up your car, leveraging the cloud and having people bring in their own devices has become a highly lucrative alternative. The bring-your-own-device, or BYOD, movement has picked up a lot of steam over the years, so much so that Gartner expects half of the world’s companies to adopt it. Over a billion devices are expected to be using BYOD by 2018, and as more and larger companies begin to take advantage of cloud computing, this trend will only accelerate.

I’ll spend the next three posts talking about three key components of most BYOD environments:

  1. Virtual desktops,
  2. Laptops and desktops, and
  3. Mobile phones and tablets

I’ll explain who the major players involved with each component are, their importance in BYOD and some things to watch out for during considerations.

With that, let’s start by talking about computers in the cloud.

Computing. Computing Everywhere.


Most bring-your-own-device setups will need a virtual desktop infrastructure, or VDI for short. Without going too deep into the details (and I’ll scratch the surface of this on this week’s Technical Thursdays post), virtual desktops give you computers in the cloud that can be used from anywhere on nearly anything, even phones and tablets.

A VDI almost always comprises:

  1. One or more servers on which these virtual machines will be hosted, which are also known as virtual machine hosts or hypervisors,
  2. Management software to start, stop, create, delete and report on machines, and
  3. Software installed on the virtual machines that make the experience more seamless and accessible.

This means that you’ll need the following to get started:

  1. A subscription to a Cloud service like Amazon EC2 or Microsoft Azure or
  2. Your own server(s) in house for hosting virtual machines (most computers made after 2007 should support this with no issues), and
  3. Enough disk space to host the number of machines you’d like to test (1TB is a good starting point),
  4. A virtual machine hypervisor like VMware ESX, Microsoft HyperV (comes with Windows 2008 R2 and up) or Xen, and
  5. A trial version of Citrix XenDesktop, VMware View, Proxmox (free) or SCVMM.

The Players

There are three major players in this space that offer all of the above with varying amounts of complexity:

  1. Citrix (XenDesktop + NetScaler, a load balancer that works really well with VDI),
  2. VMware (VMWare View), and
  3. Microsoft (HyperV + Systems Center Virtual Machine Manager 2012, usually called SCVMM).

Free and open-source solutions also exist, but they might need more love and attention depending on your situation. We’ll go into that a bit later on in this post.

The Upsides

VDI has a number of advantages aside from being a critical component of going full BYOD:

  1. The desktop is replaceable. Jim’s computer broke again? With VDI, you can get him up and running in minutes instead of hours since the desktop itself is a commodity and nothing of importance gets stored on it.

  2. Decreased hardware costs. Depending on your situation, virtual desktops make it possible to order $300 computers in bulk that can do what $2000+ computers can’t.

  3. Increased data security. Over 30 BILLION DOLLARS of valuable data and IP are lost every year due to stolen laptops and devices. Virtual desktops are configured by default to keep ALL of your data in your datacenter and your profits in your bank accounts.

  4. Your desktop is everywhere. Ever wished your team could move around within minutes instead of days? Ever wished to use cheap Chromebooks to access your desktop at work? Virtual desktops make this (and more) possible.

If you’re interested and like pretty charts, here’s a cost savings white paper published by Citrix and Gartner that go into these advantages in more detail. But we all know that every rose has thorns, and VDI is no exception. In fact, if done improperly, VDI can introduce more problems than it solves.

VDI Is A Pay To Play Sport

[Image caption: Yeah... you’ll still need to plan.]

VDI is kind-of like a new car. If you find the right one for you and take care of it, you’ll likely enjoy it significantly more than getting that used Ferrari you thought was “affordable.” (Hint: they never are.)

Deploying computers in the cloud correctly can range from “free” (but expensive in time and labor) to ridiculously expensive, depending on how complex your infrastructure will be. Here is a list of factors that determine this complexity:

  1. Number of machines. Much like their physical counterparts, virtual desktops get increasingly complicated to manage as you add more machines into the mix. However, unlike physical desktops, replacing broken machines or upgrading slow ones can be done with a few mouse clicks. Some setups even allow users to upgrade their own machines on the spot in seconds!
  2. Network bandwidth. Virtual desktops are heavily dependent on the quality of the network on which they operate. The less bandwidth they have available to them, the more tweaking you’ll need to do to make people not hate you for taking away their machines.
  3. Your company’s workload. Virtual machines on a host share computing resources with each other. Hosts will usually do everything possible to prevent one machine from hogging resources from other machines (though this can be overridden), which means that the more intensive your use case is, the less likely VDI will work for you without significant tweaking. That said, virtual desktops work well for a wide set of use cases. (Some of Citrix’s clients use virtual desktops to do CAD and heavy graphics rendering, which most people would normally pass on VDI for.)
  4. Remote workers. Users on laptops that travel a lot will often have unpredictable network conditions. While the frameworks mentioned above handle this situation really nicely, it’s important to take this into account early on in your due diligence.

There are also many little hidden costs that can turn into money pits very easily if not taken into consideration early on in the process, such as:

  1. Will you engage Citrix, VMware or third-party consulting services to help you get started, or will you or one of your engineers go solo? (Here’s a hint: Citrix and VMware will always upsell their consulting services.)

  2. Does your company use or require VPN? (The answer is usually “yes,” but most of the products mentioned above support using desktops over plain Internet.)

  3. How many users will get a virtual desktop? How many of them will actually use it? How will they use it?

  4. You’ll always need more storage than you think you do.

  5. Does your company operate under regulatory requirements?

There’s a very easy way to come upon the answers to these questions, and it’s actually a lot easier than you think.

Just do it…as a test

Building a proof-of-concept VDI is pretty straightforward in most cases. You or your admin can probably set one up in an hour or two. Building this and adding users slowly will guide you towards the answers to these questions and help you understand whether VDI is right for your company or group. More importantly, it is much easier to build VDI automation when your VDI is small than when it’s already a massive behemoth that can’t be shut down at any cost. (Why is this important? Want to roll out 10,000 virtual desktops within minutes or automatically create and remove desktops based on server conditions? You’ll need automation to do this and much more.)

Here’s a tutorial on how to set this up with Citrix XenDesktop. Here’s another tutorial for View.

Have fun!

About Me

I’m the founder of caranna.works, an IT engineering firm in Brooklyn that builds smarter and cost-effective IT solutions that help new and growing companies grow fast. Sign up for your free consultation to find out how. http://caranna.works.

If Your Business Still Uses Servers, You’re (Probably) Doing It Wrong

Your servers are useless, and you should sell them.

Many businesses, small and large, buy servers for all the wrong reasons. Some businesses want a server for an application they wrote. Some others want to keep their data “private.” Others still want servers for “better speed.”
All of these reasons are wrong. There are only three reasons I can think of that justify the purchase of physical servers (feel free to list more in the comments!):

  1. A regulator your business is beholden to requires it,
  2. Your app really does need that kind of performance (read on to find out if this is you), and
  3. You have a strong passion for burning money.


You see, when you buy servers from Dell or the like, you’re not *just* buying servers. Servers come with a ton of overhead that’s hard to see coming if you don’t buy them often enough:

  1. You’ll need to buy a support plan for when those servers decide to go on vacation during your business hours (which they will), or you pay people like me to support them (which I’m happy to do! http://caranna.works for a free consultation!),
  2. Servers need to be stored in a cool place that isn’t too dusty, and, more importantly, they need to be kept cool if you get several of them.
  3. Servers need A LOT of power (though they use less power than they used to), and ideally that power is clean (which most office buildings have, which is good)



The Cloud is not a fad.

A lot of people make fun of “the cloud,” and rightfully so; drinking games have been made out of keynotes that abused the word endlessly. Debauchery aside, “the cloud” as we know it is, from a 35,000 foot view, a collective of servers that themselves host hundreds of virtualized servers of varying sizes created by millions of people and companies. (Curious about virtualization? Keep an eye out for my post “Yes, you can have a computer in your computer” coming out tomorrow!). Instead of buying a server from Dell or HP and worrying about the above, you create a virtual server on a cloud, do what you need to do and pay for the time, storage and network bandwidth that you use.
Servers in a cloud usually cost anywhere from $0.02/hr for really basic machines to over $2/hr for really, really fast workhorses with tons of memory. What’s more incredible than these incredibly generous prices is what you get with your purchase:

  • Your servers are backed up and “copied” between many other servers in the same region (nearly every cloud service has datacenters spread out across the world), which nearly guarantees that they will always be available when you need them,
  • 24/7 monitoring of nearly anything you can think of,
  • Programming libraries that make it extremely easy for your developers to create new servers in minutes instead of days,
  • Extremely fast networking that you never need to worry about or take care of, and
  • Handfuls of additional services that save you a LOT of time and money, like:
    • Databases for your app or business that are instantly available 24/7,
    • Web services for hosting your apps that can handle one user or 10 million users with ease, or
    • Clusters of extremely fast storage for things like photos and videos that will nearly always be available



The Cloud Saves You Money

To drive the point home, let’s run through a real-life example of a use case where the cloud might be an appropriate fit.
Let’s say that you run a small individual accounting firm. Your six accountants depend on QuickBooks, TurboTax, Office and Windows. Business is doing well, and you’d like to plan for an upcoming expansion.
In most cases, this will require putting all of the machines behind Active Directory (it is significantly more difficult to manage individual Windows machines without it), putting your printer(s) behind a print server and putting your TurboTax and QuickBooks customer files on some kind of storage that’s easy for everyone to access.
To do everything in house, you’ll need:

  • One machine to serve as a domain controller and key management (license) server for new Windows installations,
  • One machine to serve as the print server (you could use the domain controller as the print server, but this will cause problems later down the road), and
  • Two cheap (but not too cheap) network-accessed storage (NAS) devices for that shared storage (one for backup)

To do this, you should plan on spending, at minimum:

  • $1500 for a Dell PowerEdge R220 (which will host the domain controller and your print server) +
  • $200 for a switch to connect those servers and your machines to (your $50 Linksys will not cut it for your expansion) +
  • $600 for one Windows Server 2012 standard license (which will cover the server and the two virtual machines hosted on top of it) +
  • $800 for the two NAS devices =
  • $3100 total + power costs


This doesn’t factor in the costs of email or computers; we’ll assume that the computers are sunk costs and that you’re already paying for Google Apps or Office 365.
This may not be a lot depending on how well your business is doing, but let’s compare it with the cost of doing the same thing on Microsoft Azure or Amazon Web Services:

  • $30/month for the domain controller (assuming an A3 instance, which should be enough for a domain controller and a few hundred machines in a single site) +
  • $15/month for the print server (assuming an A1 instance, since print servers don’t require much horsepower) +
  • $25/month for 1TB cloud storage +
  • $400 for one NAS device =
  • $70/month ($840/year) + $400 one-time cost

(Prices for resources on Amazon Web Services are similar.)

Moving this business into the cloud will not only save them hundreds of dollars per month in power costs, but will also save them thousands of dollars per year in hardware repair and depreciation costs! Another good thing about cloud services is that they are all pay-as-you-go; if you ever decide that cloud isn’t for you, you can cancel whenever you want with no early termination fees.

Trying It Out Is Risk-Free

Microsoft and Google give new users $200 and $300 in credits to try their services out with no limitations. Amazon offers a year-long free trial, but only for their most basic service level (which I’ve found inadequate for all but the most basic workloads). All of them are great, and getting started on any of them is pretty easy.

Try Azure here: https://azure.microsoft.com/en-us/pricing/free-trial/
Try AWS here: http://aws.amazon.com/free/
Try Google Cloud Platform here: https://cloud.google.com/free-trial/index

What was your physical to cloud transition story? Is there anything holding you back from trying the cloud? Leave a comment below!