Wednesday, October 15, 2014

Deploying Apache Virtual Hosts using Puppet on CentOS 6

http://funwithlinux.net/2014/10/deploying-apache-virtual-hosts-using-puppet-on-centos-6

Scaling a website to serve thousands or even tens of thousands of users simultaneously is a challenge often best tackled by horizontal scaling – distributing workloads across dozens or even hundreds of servers. As a tool for preparing servers for that task, Puppet offers low deployment costs, ease of use, and automated configuration management.
After a successful deployment of a new hardware farm, how can you ensure a consistent configuration across your entire environment? Puppet addresses that problem. Let’s see how to install Puppet and use it to deploy, as an example, an Apache web server virtual host on CentOS 6. This tutorial shows how to deploy virtual hosts on only one server, but the same steps can be replicated to manage many servers. I’ll assume you’re familiar with the Linux command line, basic networking concepts, and using Apache.

Install the Puppet Master

First, some terminology. The Puppet Master is the node in charge of storing and compiling the configuration that Puppet agents on managed nodes apply. For our purposes, one CentOS server, pup1, will be the Puppet Master, while another, pup2, will be a managed node.
Pre-compiled Puppet packages are available for most popular distros. We’ll use the open source version of Puppet through packages available in the Puppet Labs yum repository. The commands to set up the Puppet repos for x86_64 and i386 architectures are
sudo rpm -ivh https://yum.puppetlabs.com/el/6/products/x86_64/puppetlabs-release-6-7.noarch.rpm
and
sudo rpm -ivh https://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm
Once the repository is set up, you can install the puppet-server package, which enables a server to be a Puppet Master, with the command
yum install puppet-server. This installs both the Puppet Master service (puppetmaster) and the Puppet client agent and service (puppet).
Puppet Masters need to be able to receive incoming TCP connections on port 8140, so open that port in your firewall:
iptables -I INPUT -p tcp --dport 8140 -j ACCEPT && service iptables save
All of the servers need to know which server is the Puppet Master, including the Master itself. Add the following line to /etc/hosts:
127.0.0.1  puppet
To test that Puppet is running correctly, create the file /etc/puppet/manifests/site.pp:
#site.pp
file {'testfile':
      path    => '/tmp/testfile',
      ensure  => present,
      mode    => 0640,
      content => "I'm a test file.",
}
The above file is important; it’s the default manifest file for all of the Puppet nodes. It defines all resources that the Puppet Master will manage. Currently the manifest is very simple, just one static file resource, similar to a ‘hello world’ program. The first line declares the resource type and the name by which Puppet will reference the resource. path is the fully qualified pathname of the file, as CentOS will refer to it. The ensure attribute tells Puppet the file’s requested state, in this case present. mode sets the Linux file system permissions, and content is what will actually be inside the file.
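Before asking the master to serve this manifest, it’s worth checking its syntax. A quick sanity check (assuming a Puppet 3.x install, where the parser subcommand is available):
puppet parser validate /etc/puppet/manifests/site.pp
If the command prints nothing, the manifest parsed cleanly.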
Edit /etc/puppet/puppet.conf and add server = pup1 in the [agent] section. This tells the Puppet agent (which also runs on the master itself) that the Puppet Master it should contact is pup1.
Now start Puppet and the Puppet Master service with the commands service puppet start && service puppetmaster start.
To configure the services to start automatically, run chkconfig puppet on && chkconfig puppetmaster on.
If everything went well, you should see a /tmp/testfile that Puppet created as defined by the site manifest file.
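To confirm the catalog was actually applied (a simple check; the agent runs once shortly after it starts), inspect the file, or trigger a run immediately instead of waiting:
ls -l /tmp/testfile
cat /tmp/testfile    # should contain: I'm a test file.
puppet agent --test  # force an immediate run with verbose output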

Install the Puppet agent on nodes

Now that the Puppet Master is working properly, you can install the Puppet agents on other nodes. On pup2, install the Puppet repo using the same command as above, and the puppet package with yum install puppet.
Add the IP address of the Puppet Master server to the /etc/hosts on each node, along with the names “puppet” and “pup1.” In this example, our Puppet Master server is on the local LAN IP 192.168.1.1. Please edit for your configuration:
192.168.1.1  puppet pup1
Edit /etc/puppet/puppet.conf and add server = pup1 in the [agent] section, then start the Puppet agent service and add it to the list of services to start at boot time: service puppet start && chkconfig puppet on.
After the Puppet agent starts for the first time, you need to approve its certificate-signing request on the Puppet Master so the new agent can talk to the server. On pup1, run puppet cert --list. That command should display the hostname "pup2" with a certificate fingerprint. Run puppet cert sign --all to sign all pending cert requests.
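Signing everything is convenient in a lab, but on a shared network you may prefer to sign only the host you expect. For example:
puppet cert list          # show pending requests; add --all to include already-signed certs
puppet cert sign pup2     # sign just this node's request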
Now restart the agent on pup2. If everything went well, you should see /tmp/testfile created on the node, just as it was on the master server.

Create a Puppet module

Typically, Puppet agents receive their instructions from modules, which contain the manifests you wish to apply to a host or group of hosts.
For this example, create a module called webserver. First create a directory structure in the correct path: mkdir -p /etc/puppet/modules/webserver/manifests. Then create the init.pp manifest file in that directory:
#/etc/puppet/modules/webserver/manifests/init.pp
class webserver {
    package { 'httpd':
        ensure => installed,
    }
    file { 'www1.conf':
        path => '/etc/httpd/conf.d/www1.conf',
        ensure => file,
        require => [Package['httpd'], File['www1.index']],
        source => "puppet:///modules/webserver/www1.conf",
    }
    file { 'www1.index':
        path => '/var/www/index.html',
        ensure => file,
        source => "puppet:///modules/webserver/index1.html",
        require => Package['httpd'],
    }
    file { 'www2.index':
        path => '/var/www2/index.html',
        ensure => file,
        require => File['www2.docroot'],
        source => "puppet:///modules/webserver/index2.html",
        seltype => 'httpd_sys_content_t',
    }
    file { 'www2.conf':
        path => '/etc/httpd/conf.d/www2.conf',
        ensure => file,
        require => [Package['httpd'], File['www2.index']],
        source => "puppet:///modules/webserver/www2.conf",
    }
    file { 'www2.docroot':
        path => '/var/www2',
        ensure => directory,
        seltype => 'httpd_sys_content_t',
    }
    service { 'httpd':
        name => 'httpd',
        ensure => running,
        enable => true,
        subscribe => [File['www1.conf'], File['www2.conf']],
    }
    service { 'iptables':
        name => 'iptables',
        ensure => running,
        enable => true,
        subscribe => File['iptables.conf'],
    }
    file { 'iptables.conf':
        path => '/etc/sysconfig/iptables',
        ensure => file,
        source => "puppet:///modules/webserver/iptables.conf",
    }
 
}
In this manifest, we defined a class named ‘webserver’. The class defines numerous resources and can be referenced by external manifests using its name. It also introduces some resource types that did not appear in our example site.pp manifest file; they are covered here.
The first new resource type is ‘package’. A package resource is referred to directly by the name of the package in the available repositories. Since the Apache web server is packaged as httpd in the CentOS repositories, that is the name that must be used here. The ensure attribute instructs Puppet to ensure that the package is installed.
The next resource is the file ‘www1.conf’. This file’s contents come from a source file that is part of the webserver module; the file and its contents are described later. This file resource also has the ‘require’ attribute. The order in which Puppet applies manifests, and the resources defined therein, is not guaranteed. Often, for software to work properly, dependencies must be met, and you must declare them explicitly using the require attribute on each resource that depends on other packages, files, or directories. The file ‘www1.conf’ depends on the package ‘httpd’ as well as the file ‘www1.index’, which has not been defined yet. The require attribute instructs Puppet to make those other resources available before processing the file ‘www1.conf’.
The final new resource type is ‘service’. The name of the service refers directly to the system service installed by a package. The service ‘httpd’ is the Apache web server on CentOS, and thus is what Puppet references. The ‘enable’ attribute controls whether the service starts at system boot, and the ‘ensure’ attribute tells Puppet the desired run state of the service. A special attribute for the service type is ‘subscribe’, which instructs Puppet to restart the service whenever the contents of www1.conf or www2.conf change.
Puppet manifests require careful planning. For instance, you don’t want to create a www1.conf file until the httpd package is installed, because the necessary /etc/httpd/ directories will not have been created yet, and trying to create a file in a nonexistent directory will cause Puppet to fail. Therefore, we ‘require’ the httpd package to be installed before creating the www1.conf file. Puppet will also fail if it encounters any circular dependencies, so be sure your resources are planned logically.
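As an aside, the same ordering can be expressed with Puppet’s chaining arrows instead of require/subscribe attributes. A minimal, equivalent sketch for the www1 resources (the ~> arrow both orders the resources and restarts the service when the file changes):
# Ordering declared outside the resource bodies
Package['httpd'] -> File['www1.index'] -> File['www1.conf'] ~> Service['httpd']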
Since several of the file resources declare an external source, we need to create those source files as part of the module so the Puppet agent can write them to the local file system of the nodes. Place these files in /etc/puppet/modules/webserver/files/ on the Puppet Master, pup1:
index1.html:
<html>
<body>
I'm in /var/www
</body>
</html>
index2.html:
<html>
<body>
I'm in /var/www2
</body>
</html>
iptables.conf:
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [50:4712]
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 8140 -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
www1.conf:
NameVirtualHost *:80
<VirtualHost *:80>
  DocumentRoot /var/www
  ServerName www1.example.com
</VirtualHost>
www2.conf:
NameVirtualHost *:80
<VirtualHost *:80>
  DocumentRoot /var/www2
  ServerName www2.example.com
</VirtualHost>
That’s everything you need for this Puppet module. You now need to include it in the site manifest file. To apply this module to all registered nodes, add an include statement to the manifest file. Since it is unlikely that every server in an environment will have exactly the same configuration applied to it, you can use the node statement to apply a given module only to a given system. If you have more than one server, you can use a regular expression in place of the hostname string instead of listing hosts one by one (see the sketch after the manifest below). Important: if you define one node in the site’s manifest file, then you must define either all nodes, or a default node statement for any remaining hosts.
#site.pp
file {'testfile':
      path    => '/tmp/testfile',
      ensure  => present,
      mode    => 0640,
      content => "I'm a test file.",
}
node 'pup2' {
      include webserver
}
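If you later add more web servers, a regular-expression node block plus a default block could handle them; a minimal sketch (the web01/web02 naming here is only an illustration):
# Apply the webserver module to any host named web01, web02, ...
node /^web\d+$/ {
      include webserver
}
# Required fallback once any node is defined; these hosts still get the
# top-scope resources such as 'testfile'
node default {
}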
The Puppet Agent on pup2 will automatically apply the new configuration (also referred to as the catalog) during the next runinterval, which is how often the Puppet Agent applies the catalog; this is every 30 minutes by default and is configurable in your node’s puppet.conf file. You can also apply the catalog immediately and see any debugging output by running puppet agent --test.
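For quicker feedback while testing, you can shorten the interval in the node’s /etc/puppet/puppet.conf; the 300-second value below is only an example:
[agent]
    server      = pup1
    runinterval = 300    # seconds; apply the catalog every 5 minutes instead of every 30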
After applying the new catalog, you should be able to view both of your new sites on pup2 by editing your workstation’s hosts file to map the new domains to pup2’s IP address. If you were deploying these new nodes in a production setting, the (many) nodes would most likely sit behind a software or hardware load balancer (such as HAProxy, Amazon Elastic Load Balancing, or an F5), with DNS entries pointing to the virtual IP of the load balancer.
This brief tutorial should give you an idea of how easy it is to get up and running with Puppet. While we created simple VirtualHosts by hand, Puppet also has a wide variety of modules available that let Puppet manage VirtualHosts directly.
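For example, the puppetlabs-apache module from the Puppet Forge can manage virtual hosts as resources of their own. A minimal sketch (the parameters shown follow the module’s basic usage, not anything this tutorial set up; check the module documentation before relying on it):
# On the Puppet Master: puppet module install puppetlabs-apache
class { 'apache': }
apache::vhost { 'www1.example.com':
    port    => '80',
    docroot => '/var/www',
}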
