Thursday, March 30, 2017
PGConf 2017 - PostgreSQL High Availability in a Containerized World
My slides from today's presentation at #PGConfUS 2017 in Jersey City.
Wednesday, November 16, 2016
PostgreSQL High Availability in a Containerized World - PGConfSV 2016
My slides from today's talk at PGConf SV 2016
 
Thursday, June 30, 2016
Hello Docker on Windows
NOTE: 9/8/2016: This is an older post which I wrote a few months ago but never posted.
After using Docker on Linux for more than a year, it was finally time to try it on a different platform. Docker on Windows Server 2016 TP4 was one way to try it out, but that experience was a bit more complicated. So when I heard about Docker on Windows 10, I was initially surprised. Why? From what I had seen, it really needed Hyper-V features to run, which I assumed were only available on the Windows Server line.
I guess I was wrong. Under Control Panel -> Programs and Features -> Turn Windows features on or off, there is a feature called Hyper-V which can be turned on.
Now, before you start searching for it and trying to turn it on, read the following to save yourself some hassle.
1. You need Windows 10 Pro (sorry, the plain Windows 10 Home edition does not work).
2. You need a CPU which supports virtualization and SLAT (Intel calls its implementation EPT, AMD calls its RVI).
With Task Manager -> Performance -> CPU it is easy to figure out whether virtualization is supported. SLAT is another story: systeminfo or coreinfo is required to figure that out. You may be able to turn on some of the Hyper-V components on CPUs that do not support SLAT, but that will not be enough.
I had to cycle through a few laptops using Intel Core 2 Duo and Intel Pentium chips, which support virtualization but not SLAT, before I finally came across my dusty desktop with an AMD Phenom, which had virtualization with SLAT support and was running Windows 10.
Of course, I then applied for the Docker beta program on Windows. The invitation came yesterday, and I finally got a chance to download the Docker binaries and install them.
Once the installation (as Administrator, of course) finished, it gave the option to launch Docker, and after it finished launching the daemon in the background it showed a splash image as follows:
Good job, Docker, on the usability of showing me what to do next:
Next I deployed an nginx server as follows:
Whoa!! In case it did not strike you: I am running Linux images here on Windows!!
Now I can access it in a browser as http://docker/
(This, I would say, was a bit of a struggle since I had not read the docs properly. I was trying http://127.0.0.1/, http://localhost and http://LOCAL/, but only http://docker worked.)
Overall, very interesting and game changing for development on Windows!
Wednesday, January 06, 2016
PostgreSQL and Linux Containers: #SouthbayPUG Presentation
It was great to talk about Linux Containers tonight at the Southbay PostgreSQL User Group at Pivotal.
The slides are now posted online:
 
Wednesday, September 30, 2015
Mirror Mirror on the wall, Where's the data? In my Vol
When I first started working with Docker last year, there was already a clear pattern out there: the Docker image itself consists only of the application binary (and, depending on the philosophy, the entire set of OS libraries that are required), and all application data goes in a volume.
The concept called "Data Container" also seemed to be somewhat popular at that time. Not everyone bought into that philosophy, and various other patterns were emerging for how people used volumes with their Docker containers.
One of the emerging patterns was (or still is) "Data Initialization if it does not exist" during container startup.
Let's face it: when we first start a Docker container running, say, a PostgreSQL 9.4 database, the volume is an empty file system. We then run initdb and set up a database so that it is ready to serve.
The simplest way is to check whether the data directory has data in it and, if it does not, run initdb, apply the most common best practices for the database, and serve it up.
Where's the simplest place to do this? In the entrypoint script of the Docker container, of course.
I made the same mistake in my jkshah/postgres:9.4 image too. In fact, I still see the same pattern in the official postgres Docker image, which looks for PG_VERSION and runs initdb if it does not exist.
if [ ! -s "$PGDATA/PG_VERSION" ]; then
    gosu postgres initdb
    ...
fi
This certainly has advantages:
1. Very simple to code the script.
2. Great Out of the box experience - You start the container up - the container sets itself up and it is ready to use.
Let's look at what happens next in real-life enterprise usage.
We ran into scenarios where the applications using such databases were running, but they had lost all their data. Hmm, what's going wrong here? The application is working fine, the database is working fine, but all the data looks freshly deployed rather than like something that had been running well for 3-5 months.
Let's look at the various activities that an enterprise will typically perform on such a data volume, that is, the file system on the host where the PostgreSQL containers are running.
1. The host location of the volume itself will be a mounted file system coming off a SAN or some other storage device.
2. The enterprise will be backing up that file system at periodic intervals.
3. In some cases they will be restoring that file system when required.
4. Sometimes the backend storage may have hiccups. (No! That does not happen :-) )
In any of the above cases, where a mount fails, the wrong file system gets mounted, or a restore fails, you could end up with an empty file system at the volume path. (Not everyone had checks for this.)
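One such check can be sketched as a small guard that runs on the host before the container is started. This is only an illustration: the function name require_mount is made up for this post, and it assumes the mountpoint command (from util-linux or busybox) is available on the host.

```shell
# Hypothetical pre-start guard: succeed only if the path is an active mount point.
# Assumes the `mountpoint` utility is installed on the host.
require_mount() {
    path="$1"
    if ! mountpoint -q "$path" 2>/dev/null; then
        echo "refusing to continue: $path is not a mounted file system" >&2
        return 1
    fi
}
```

A wrapper script could call require_mount /hostpath/pgdata and only run docker afterwards, so a failed mount stops the deployment instead of handing the container an empty directory. It does not catch the case where the wrong file system is mounted at that path.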
Now when you start the PostgreSQL Docker container on such a volume, you will get a new, fully initialized database. Most of the automation that I have seen works such that, in those cases, even the application will fully initialize the database with its own schema and initial data, and the application moves on like nothing is wrong.
In that case the application may appear to be working to all probes, until a customer tries to log in and finds that they do not exist in the system.
For DBAs the strict rule is that a "No Data" error is better than "Wrong/Lost Data" served out of a database (especially for PostgreSQL users). For this reason, this particular pattern of database initialization is becoming an anti-pattern in my view, especially for Docker containers. A better approach is to have one entrypoint command that deliberately does the setup (initialization), and then have all subsequent starts use another entrypoint command that specifically fails if it does not find the data.
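A minimal sketch of that split could look like the following. It is not the entrypoint of any published image: the names has_cluster and pg_entrypoint are hypothetical, and the initdb and postgres invocations are only echoed so that the control flow is easy to follow.

```shell
# Hypothetical entrypoint sketch: "init" must be requested explicitly,
# while "start" refuses to run on a data directory without a cluster.

# Returns 0 only if the directory looks like an initialized cluster.
has_cluster() {
    [ -s "$1/PG_VERSION" ]
}

pg_entrypoint() {
    cmd="$1"
    datadir="$2"
    case "$cmd" in
        init)
            # Deliberate setup: never overwrite an existing cluster.
            if has_cluster "$datadir"; then
                echo "refusing to init: $datadir already has a cluster" >&2
                return 1
            fi
            echo "would run: initdb -D $datadir"
            ;;
        start)
            # Normal start: missing data is an error, not a cue to initialize.
            if ! has_cluster "$datadir"; then
                echo "refusing to start: no PG_VERSION in $datadir" >&2
                return 1
            fi
            echo "would run: postgres -D $datadir"
            ;;
        *)
            echo "usage: init|start <datadir>" >&2
            return 2
            ;;
    esac
}
```

In a real image the script would end by calling pg_entrypoint with the container's arguments and would replace the echo lines with the actual commands. The point of the design is that an empty volume on a normal start produces a loud failure rather than a quietly fresh database.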
Of course again this is a philosophical view on how it should be handled. I would love to hear what people have to say about this.
Sunday, September 20, 2015
Is it a privilege to run a container in Docker?
Recently, while working with various applications in Docker containers, we came across a few containers that will not run properly unless privileged mode is enabled. Privileged mode gives the container the same rights as the host, which means it can make changes on the host where the container runs. (A huge difference compared to a VM: imagine your VM making changes to the hypervisor directly.)
Of course privileged mode has its uses, and I am definitely glad that it is available. However, it is not a general-purpose option to be used lightly. So imagine my surprise that one of the most common tools used in many enterprises now, Chef Server, also required privileged mode when running in a Docker container. There are various versions available, but they all required the mode.
While investigating why Chef Server requires the mode, I found that it primarily needs it to set some ulimit parameters and a specific kernel parameter inside the container.
sysctl -w kernel.shmmax=17179869184
Now before you say, "Aha, simple, let's change the value on the host and let the container pick it up from there": been there, it ain't gonna work. The reason is how Linux namespaces work with CLONE_NEWIPC. The net result is that every time a container is created, a new System V IPC namespace is set up with the default shmmax of 32MB. The default will be changed to 4GB in a later Linux kernel, but of course, like most companies, we do not have the patience to wait for that kernel to show up, let alone for a certified Linux distro for production setups.
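To see the effect for yourself, you can read the limit on the host and again inside a container and compare the two. The helpers below are a small sketch with made-up names (shmmax_bytes, bytes_to_gib); they only read /proc and do integer arithmetic, and they assume a 64-bit shell.

```shell
# Read the System V shared memory ceiling visible in the current IPC namespace.
shmmax_bytes() {
    cat /proc/sys/kernel/shmmax
}

# Convert a byte count to whole GiB using shell integer arithmetic.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# The value requested by the sysctl command above, expressed in GiB.
bytes_to_gib 17179869184
```

The last line prints 16, so the setting asks for a 16GB ceiling. Running shmmax_bytes on the host and then in a shell inside a container shows whether the container inherited the host value or got the namespace default.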
There are a few hacks to work around it, as Jerome indicates in a mailing list. But none of them was suitable.
Now let's go back to the original command that needed to be executed. I have worked with such commands for years, always to increase shared memory for databases that use System V style shared memory, like Oracle, PostgreSQL (well, till 9.2), etc.
Guess what: after a little digging I did find PostgreSQL used as an embedded database in $CHEF_SERVER_INSTALL/embedded/bin/postgres. Checking the version of the "postgres" binary confirmed it to be 9.2.
Checking the latest version of Chef Server, I found it still using Postgres 9.2. I eventually ended up creating a custom image using Postgres 9.4 and, voila, got the container running without privileged mode. Thanks, Robert Haas.
It also means that as more and more PostgreSQL-based applications run in containers, it is better to move to the latest version of PostgreSQL for a better experience.
Friday, February 20, 2015
CentOS 7, Docker, Postgres and DVDStore kit
It's been a long time since I have posted an entry. It has been a very busy year, and more about that in a later post. Finally I had some time to try out new versions of Linux and new OSS technologies.
I started to learn by installing the latest version, CentOS 7. CentOS closely follows RHEL 7, and coming from SLES 11 and the older CentOS 6.5, I saw many new changes which are pretty interesting.
New commands to learn immediately as I started navigating:
systemctl
firewall-cmd
I admit that I missed my favorite files in /etc/init.d, and the new location of /etc/systemd/system/multi-user.target.wants/ will take me a while to get used to.
firewall-cmd was actually more welcome, considering how hard I found it to remember the exact rule syntax of iptables.
There is the new GRUB 2, but honestly I do not even worry about it lately (which is a good thing). Apart from that, I see XFS is the new default file system, LVM now has snapshot support for Ext4 and XFS, and many more.
However, the biggest draw for me was the support for Linux containers. As a Sun alumnus, I was always drawn to the battle of who did containers first. I no longer worry about it, but as BSD Jails progressed to Solaris Containers and now to the hottest technology, Docker containers, it sure has its appeal.
To install Docker, however, you need the "Extras" CentOS 7 repository enabled. Docker is being updated quickly, though, so the "Extras" repository is getting old at 1.3, while the latest release (as of last week) is Docker 1.5. To get Docker 1.5 you will need to enable the "virt7-testing" repository on CentOS 7.
I took a shortcut and just created a file /etc/yum.repos.d/virt7-testing.repo with the following contents:
 
[virt7-testing]
name=virt7-testing
baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/
enabled=1
gpgcheck=0
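If you prefer to script that step, the file can be written with a here-document. The function name write_virt7_repo is made up for this sketch, and the optional path argument exists only so it can be tried against a scratch file first. Note that gpgcheck=0, as in the contents above, turns off package signature checking for this repository.

```shell
# Write the repo definition to the given path (defaults to the real location).
# Writing to /etc/yum.repos.d requires root.
write_virt7_repo() {
    target="${1:-/etc/yum.repos.d/virt7-testing.repo}"
    cat > "$target" <<'EOF'
[virt7-testing]
name=virt7-testing
baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/
enabled=1
gpgcheck=0
EOF
}
```

Calling write_virt7_repo with no argument as root creates the file in place; passing a path such as /tmp/virt7-testing.repo lets you inspect the result before touching the system configuration.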
Then I was ready to install docker as follows
# yum install docker
I found that this does not start the daemon immediately, so using the new systemctl command I enabled and then started the daemon
# systemctl enable docker
# systemctl start docker
We now have the setup ready. However, what good is the setup unless you have something to demonstrate quickly? This is where I see Docker winning over other container technologies, and it is probably their differentiator: there is an "AppStore" from which container images can be downloaded. Of course, you need a login to access the Docker Hub, as it is called, at http://hub.docker.com (which fortunately is free).
# docker login
That logs you in to the hub, and you are now ready to pull images.
I have uploaded two images for the demonstration for today
1. A Standard Postgres 9.4 image
2. A DVDStore benchmark application image based on kit from http://linux.dell.com/dvdstore/
Downloading the images is as simple as a pull
# docker pull jkshah/postgres:9.4
# docker pull jkshah/dvdstore
Now let's see how to deploy them.
Since PostgreSQL 9.4 is a database, it requires storage for "Persistent Data", so first we create a location on the host that can be used for storing the data.
# mkdir -p /hostpath/pgdata
SELinux is enabled by default on CentOS 7 which means there is an additional step required to make the location read/write from Linux containers
# chcon -Rt svirt_sandbox_file_t /hostpath/pgdata
Now we will create a container as a daemon which will map the container port to host port 5432 and set up a database with the username and password that we provide. (Please do not use secret as the password :-) )
# docker run -d -p 5432:5432 --name postgres94 -v /hostpath/pgdata:/var/lib/postgresql/data -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=secret -t jkshah/postgres:9.4
If you now check /hostpath/pgdata you will see the database files on the host.
Now let's deploy an application using this database container.
# docker run -d -p 80:80 --name dvdstore2 --link postgres94:ds2db --env DS2DBINIT=1 jkshah/dvdstore
The above command starts another container based on the DVDStore image, which expects a database named "ds2db". That is satisfied by the link option, which links the database container created earlier. The application container also initializes the database so it is ready to serve requests at port 80 of the host.
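For the curious, here is roughly how a process inside the application container could locate the linked database. Docker links add the alias to the container's /etc/hosts and export variables named after the alias and the exposed port, such as DS2DB_PORT_5432_TCP_ADDR. The helper below is a sketch under the assumption that the database image exposes port 5432; the function name ds2db_endpoint is hypothetical, and this is not necessarily how the DVDStore image itself does the lookup.

```shell
# Hypothetical helper: work out where the linked database lives.
# Prefers the variables injected by --link postgres94:ds2db and falls back
# to the link alias, which Docker also adds to /etc/hosts.
ds2db_endpoint() {
    host="${DS2DB_PORT_5432_TCP_ADDR:-ds2db}"
    port="${DS2DB_PORT_5432_TCP_PORT:-5432}"
    echo "$host:$port"
}
```

Inside the linked container, ds2db_endpoint would print the database container's address and port, which a startup script could then hand to the application's connection settings.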
This opens up new avenues to benchmark your PostgreSQL hardware easily. (Wait, the load test driver code is still on Windows :-( )
 
 
