Wednesday, January 06, 2016

PostgreSQL and Linux Containers: #SouthbayPUG Presentation

It was a great to talk about Linux Containers tonight at Southbay PostgreSQL User Group at Pivotal.
The slides are now posted online:

PostgreSQL and Linux Containers from Jignesh Shah

Wednesday, September 30, 2015

Mirror Mirror on the wall, Where's the data? In my Vol

When I first started working with docker last year, there was a clear pattern already out there the docker image itself only consists of application binary (and depending on the philosophy - the entire OS libraries that are required) and all application data goes in a volume.

Also the concept called "Data Container" also seemed to be little popular at that time. Not everyone bought into that philosophy and there were various other patterns emerging out then on how people used volumes with their docker containers.

One of the emerging pattern was (or still is) "Data Initialization if it does not exist" during container startup.
Let's face it, when we first start a docker container consisting of say PostgreSQL 9.4 database the volume is an empty file system. We then do an initdb and setup a database so that it is ready to serve.
The simplest way is to check if the data directory has data in it and if it does not have data, then run initdb and setup the most common best practices of the database and serve it up.

Where's the simplest place to do this? In the entrypoint script of docker container of course.

I did the same mistake in my jkshah/postgres:9.4 image too. In fact I still see that same pattern in the official postgres docker image also where it looks for PG_VERSION and if it does not exists then it runs initdb.

if [ ! -s "$PGDATA/PG_VERSION" ]; then
gosu postgres initdb
...
fi

This certainly has advantages:
1. Very simple to code the script.
2. Great Out of the box experience - You start the container up - the container sets itself up and it is ready to use.

Lets look what happens next in real life enterprise usages.

We got in scenarios while the applications using such databases are running but they lost all data in it. Hmm what's going wrong here? The application is working fine, the database is working fine, but all data is like it was freshly deployed and not something that was running well for 3-5 months.

Let's look at various activities that an enterprise will typically do with such a data volume - file system on the host where PostgreSQL containers are running.
1. The host location of the volume itself will be a mounted file system coming off SAN or some storage device.
2. Enterprise will be backing up that file system on periodic intervals
3. On some cases they will be restoring that file system when required.
4. Sometimes the backend storage may have hiccups. (No ! That does not happen :-) )

In any of the above cases, where a mount fails or mounts a wrong file system or if the restore fails, you could end up with an empty file system for a volume path. (Not all people had checks for this)

Now when you start the PostgreSQL docker container on such a volume you will get a new database fully initialized. Most current automations that I have seen works such that in those cases even the application will fully initialize the database with its own schema and initial data and the application moves on like nothing is wrong here.

In the above case it might seem that the application is working to all probes till a customer tries to login into the setup and find that they do not exist in the system .

For DBAs the anal rule is "No Data" error is better than "Wrong/Lost Data" serviced out of a database (specially PostgreSQL users). For this reason, this particular pattern of database initialization is becoming an ANTI Pattern in my view specially for docker containers. A better approach is to have an entrypoint command specifically to do a setup(initialization) knowingly and then all subsequent starts should be called with another entrypoint command to specifically fail if it does not find the data.

Of course again this is a philosophical view on how it should be handled. I would love to hear what people have to say about this.

Sunday, September 20, 2015

Is it a privilege to run a container in Docker?

Recently while working with various applications in a docker container, we came across few containers that will not run properly unless privileged mode is enabled. The privileged mode gives the container the same rights as host which means it can make changes on host where the container runs. (Huge difference compared to VM - Imagine your VM making changes to the hypervisor directly.)

Of course privileged mode has its uses and I am definitely glad that it is available. However it is not a general purpose option to be used lightly. So imagine my surprise that one of the most common tools that is used in many enterprises now Chef server when running in a docker container also required privileged mode to run. There are various versions available but they all required the mode.

While investigating Chef Server to see why it requires the mode I found it primarily requires it to set some ulimit parameters and a specific kernel parameter inside the container.

sysctl -w kernel.shmmax=17179869184

Now before you say, aha simple lets change the value in the host itself and let the container pick up the value from the host itself.. Let me say been there .. it ain't gonna work. The reason it does not work is due to how Linux namespaces work with CLONEIPC. The net result is everytime a container is created a new namespace of System V IPC is setup with the default shmmax of 32MB. The default will be changed in a later Linux kernel to 4GB but of course like most companies there will not be patience to wait for the Linux kernel to show up let alone a certified Linux distro for production setups.

There are few hacks to work it out as Jerome indicates in a mailing list. But of course none of them was something that was suitable.

Now lets go back to the original command that needed to be executed which required. I have worked with those commands for years always to increase shared memory for databases that uses Sys V style of shared memory like Oracle, PostgreSQL (well till 9.2), etc.

Guess what doing a little digging I did find PostgreSQL used as an embedded database in $CHEF_SERVER_INSTALL/embedded/bin/postgres. Checking the version of "postgres" binary confirmed it to be 9.2.

Checking latest version of Chef server found it to be still using Postgres 9.2. Eventually ended up creating a custom image using Postgres 9.4 and voila got the container running without privileged mode. Thanks Robert Haas.

It also means that as more and more PostgreSQL based containers are being used in containers, it is better to move to the latest version of PostgreSQL for a better experience.

Friday, February 20, 2015

CentOS 7, Docker, Postgres and DVDStore kit

Its been a long time since I have posted an entry. It has been a very busy year and more about that in a later post. Finally I had some time to try out new versions of Linux and new OSS technologies.

I started to learn by installing the latest version of CentOS 7. CentOS closely follows RHEL 7 and coming from SLES 11 and older CentOS 6.5, I saw many new changes which are pretty interesting.

New commands to learn immediately as I started navigating:
systemctl
firewall-cmd

I admit that I missed my favorite files in /etc/init.d and looking at new location of /etc/systemd/system/multi-user.target.wants/ will take me a while to get used to.

firewall-cmd actually was more welcome considering how hard I found to remember the exact rule syntax of iptables.

There is new Grub2 but honestly lately I do not even worry about it (which is a good thing). Apart from that I see XFS is the new default file system and LVM now has snapshot support for Ext4 and XFS and many more.

However the biggest draw for me was the support for Linux Containers. As a Sun alumni, I was always draw to the battle of who did containers first and no longer worry about it, but as BSD Jails progressed to Solaris Containers to now the hottest technology: Docker container, it sure has its appeal.

In order to install docker however you need the "Extras" CentOS 7 repository enabled. However docker is being updated faster so the "Extras" repository is getting old at 1.3 with the latest out (as of last week) is Docker 1.5. To get Docker 1.5 you will need to enable "virt7-testing" repository on CentOS 7

I took a shortcut to just create a file /etc/yum.repos.d/virt7-testing.repo with the following contents in it.

[virt7-testing]

name=virt7-testing

baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/

enabled=1

gpgcheck=0

Then I was ready to install docker as follows

# yum install docker

I did find that it actually does not start the daemon immediately, so using the new systemctl command I enabled and then started the daemon

# systemctl enable docker

# systemctl start docker

We now have the setup ready. However what good is the setup unless you have something to demonstrate quickly. This is where I see Docker winning over other container technology and probably their differentiator. There is an "AppStore" for the container images available to download images. Of course you need a login to access the Docker Hub as it is called at http://hub.docker.com (which is for free fortunately).

# docker login

To login to the hub and now you are ready to get new images.

I have uploaded two images for the demonstration for today

1. A Standard Postgres 9.4 image

2. A DVDStore benchmark application image based on kit from http://linux.dell.com/dvdstore/

To download the images is as simple as pull

# docker pull jkshah/postgres:9.4

# docker pull jkshah/dvdstore

Now lets see on how to deploy them.

For PostgreSQL 9.4 since it is a database it will require storage for "Persistent Data" so first we make a location on the host that can be used for storing the data.

# mkdir /hostpath/pgdata

SELinux is enabled by default on CentOS 7 which means there is an additional step required to make the location read/write from Linux containers

# chcon -Rt svirt_sandbox_file_t /hostpath/pgdata

Now we will create a container as a daemon which will map the container port to host port 5432 and setup a database with a username and password that we set. (Please do not use secret as password :-) )

# docker run -d -p 5432:5432 --name postgres94 -v /hostpath/pgdata:/var/lib/postgresql/data -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=secret -t jkshah/postgres:9.4

Here now if you check /hostpath/pgdata you will see the database files on the host.

Now lets deploy an application using this database container.

# docker run -d -p 80:80 -–name dvdstore2 -–link postgres94:ds2db –-env DS2DBINIT=1 jkshah/dvdstore

The above command starts another container based on the DVDStore image which expects a database "ds2db" defined which is satisfied using the link option to link the database container created earlier. The application container also intiailizes the database so it is ready to serve requests at port 80 of the host.

This opens up new avenues to now benchmark your PostgreSQL hardware easily. (Wait the load test driver code is still on Windows :-( )

Friday, November 22, 2013

Feedback from my sesion at PgConfEU

I had recently done a session at PgConf.EU.

It was interesting and encouraging to see the feedback from it.
Thanks to the people who took time to give the feedback.

Session feedback: My experience with embedding PostgreSQL

Speaker: Jignesh Shah

A total of 4 feedback entries have been submitted for this session.

Rankings

Topic importance
Rating	Count
1	0
2	0
3	1
4	0
5	3

(5 is highest, 1 is lowest)

Content quality
Rating	Count
1	0
2	0
3	0
4	0
5	4

(5 is highest, 1 is lowest)

Speaker knowledge
Rating	Count
1	0
2	0
3	0
4	0
5	4

(5 is highest, 1 is lowest)

Speaker quality
Rating	Count
1	0
2	0
3	0
4	1
5	3

(5 is highest, 1 is lowest)

Comments

Very good presentation - IMHO please add Oracle License Costs to your slides. As on the first view - Business people wont understand that they are variable and WAY HIGHER as expected.
One of the best talks this year.

Thursday, October 31, 2013

My experience with embedding PostgreSQL - #pgconfeu

At my current company, we embed PostgreSQL based technologies in various applications shipped as shrink-wrapped software. In this session we talk about the experience of embedding PostgreSQL where it is not directly exposed to end-user and the issues encountered on how they were resolved.
We will talk about business reasons,technical architecture of deployments, upgrades, security processes on how to work with embedded PostgreSQL databases.

Tuesday, September 17, 2013

PostgreSQL replication? I need temp tables for my reports

One of the frequent things I hear that many times PostgreSQL users avoid running PostgreSQL replication because they want to offload reports that use temporary tables on the slaves. Since PostgreSQL replicas are pure read only, it cannot support temporary tables.

There is one way to overcome this with postgres_fdw - PostgreSQL Foreign Data Wrapper which are improved in PostgreSQL 9.3 which is now released.

Create a Master-Slave setup with PostgreSQL 9.3 (with synchronous or asynchronous replication as per your needs).
On the slave setup, setup another PostgreSQL 9.3 instance (with different port) with postgres_fdw and map all tables from slaves as foreign tables with same names as their remote counterparts.
Run reports which requires temporary tables using this new instance

Of course there are few caveats for this setup

Query plans: Currently they are still inefficient but as postgres_fdw improves, this will likely go away. Infact more usage of this use-case scenario will force it to be improved
Lot of data moving: Most DW reports do read lot of rows. However by setting it up on the same server most of it are loopback and dont go on the wire outside.
More Resources: This will do require more memory/cpu on the server but it is still cheaper since the management of such a server is still more simpler compared to other complex designs to achieve the same goal

I would like to hear about your experiences on the same too so feel free to send me comments.

Tuesday, May 21, 2013

How to do Postgres Replication and Failover in VMware vFabric Data Director 2.7?

Last week VMware released vFabric Data Director 2.7. Among the many new features for various database, I wish to give a little more insight into my favorite ones which are regarding Postgres.

One of the big feature add from a broad perspective is support of Postgres 9.2 based managed database servers along with replication. Lets look at how it is done in brief.

First vFabric Postgres 9.2 base DBVM needs to be uploaded into Data Director system resource pool and then converted into template. Note this template is different from vFabric Postgres 9.2 Virtual appliance and is available on the same location where Data Director 2.7 download exists. Also this DBVM template is based on the latest PostgreSQL 9.2.4 core.

Next once the template is loaded into the system resource pool it will show up in Base DBVMs section in System->Manage and Monor-> Templates-> BaseDBVMs

Then you would then right click on the base DBVM and select “Convert to Base DB Template”. Here you also have a new feature to add more disks (think Tablespaces in PostgreSQL) to the template. In Data Director 2.7, disks are added at the template level.

This process take some time and then it should show up in Base DB Templates. On the Base DB Template section right click on the newly created template and select “Validate”. This one creates a test database to see if it is successful or not before it can be rolled out to the Organizations and Database groups . Once the validation is a success, right click and select “Assign it to the resource bundle” and select the resource bundle that will have access to the template or create a new resource bundle to use the template.

Note in the Resource bundle creation steps lies yet another new way to separate out IO on separate datastores since these may need different IO characteristics

The above image shows how the various types of datastores for your OS, Backup, Data and Logs (Data can be multiple location if you need multiple tablespaces)

Now all the Orgs/Database groups using the resource bundle will see the new vPostgres 9.2 template.

In a sample demo I created a database called MyDB using few wizard questions.

Now on the DB List we do a right click on the database to create more replicas

Couple of more quick wizard questions and a slave is ready. Similarly you can create more slaves of the same master. Now actually behind the scenes it does all the plumbing right from creating a new clone virtual machine from the master and setting it up using standard PostgreSQL replication on the newly created VM but from a user point of view it it is all point , click and done.

The database dashboard also has a new portlet to show bit more information about the replication.

To make a slave as a new master and it will give an option to move all the other slaves to the new master select as follows:

After completion of the failover, the status shown will be similar to:

Where the original slave replica is now redirected to the new Master. The original master is kept as it is for Port-Mortem or other activities. This also gives a way to use it as new use cases like separating a slave for test purpose/DR purpose or separating “mirrors” etc. Best of all it uses all Standard PostgreSQL replication.

Of course one of the thing that it does not do is re-purpose the original Master as a slave of the new Master. For more details there is a perfect opportunity to find Heikki Linnakangas from VMware at PGCon in Ottawa this week and ask him the question after his session .. WHY??? :-)

Tuesday, May 14, 2013

How can PostgreSQL 9.3 beta1 help you?

PostgreSQL 9.3 beta1 is now available. Giving early access to software is always a good idea to test out evolutionary, revolutionary, radical ideas because unless it is field tested, it has not gone through its trial by fire to be proven gold.

There are many new changes introduced in PostgreSQL 9.3 beta1 and I do have few favorites in them.

For example Disk page checksums to detect filesystem failures. In fact this would allow VMware to use the now standard disk page checksum instead of a custom feature. This highly debated feature is required to identify silent bit corruptions (or deter malicious ones). I have been told in talks with database administrators (not just PostgreSQL DBAs) that typically in a year they would face one such incident atleast where one of the disk would show such a bit rot which goes unnoticed without any instrumentations to catch it.

Another change that goes in the right direction is how PostgreSQL maps the shared memory. This small change now allows no kernel changes to be done to start the database with a bigger shared buffer pool. This now allows one less cookbook step to be done to get the database working. Considering that in this cloud world where there are 100,000 VMs running databases one less step is a huge increase in productivity since this step actually required privileges higher than the database instance owner.

Yet another favorite feature is the custom background workers. This new mechanism is certainly a popular one in our team at VMware where are using it heavily to move some of the changes that we had done into custom background workers deployed as extensions and allowed us to align with core PostgreSQL and extra features enabled as extensions as needed.

Next I want to talk of three features : Writeable Foreign Tables and pgsql Foreign Data Wrapper and Automatically update VIEWs together. These features on its own itself are very useful and generic. However when used together it actually opens new possibilities using multiple federated PostgreSQL databases shards with single logical view of the whole database as one. Quite Powerful if you think about it. I hope to see people trying these fundamental features into new derived features now made possible.

Also new JSON functions help PostgreSQL on its evolution to be the Data Platform not just for relational data but also document data.

While I have barely scratched the surface of all the new features in PostgreSQL 9.3 beta1, I am already excited with this release and the possibilities I see going forward in the world of data.

Wednesday, March 13, 2013

PyPgDay 2013 - HA / Replication for PostgreSQL in Virtualized Environments

My presentation at PyPgDay 2013 on HA and Replication for PostgreSQL on Virtualized Environments

Thursday, November 15, 2012

A Step Forward

Recently I upgraded my "lab" setup.
Now it currently looks as follows:

* 2x Physical Hosts running vSphere 5.1
* Controlled by vCenter 5.1 Server Appliance backed with embedded Postgres Database instance
* Monitored by vCenter Operations which has two embedded Postgres Database instance in the vApp
*To monitor my VMs, I installed vFabric Hyperic 5.0 also running embedded Postgres Database VM.
* DBaaS provider vFabric Data Director 2.5 also installed with its embedded Postgres database instance running too

Now for my VMs:
* vFabric Postgres 9.1.6.0 VM integrated with vSphere HA
* My Linux Developer VM running PostgreSQL 9.2.1

If you get everything you need with Postgres, why even QA for other databases anymore?
Well vFabric Hyperic took the first bold step forward.

Wednesday, October 10, 2012

SFPUG Oct 2012: DVDStore Benchmark and PostgreSQL

I presented the DVDStore Benchmark at SFPUG yesterday night.
The presentation is available at http://portal.sliderocket.com/vmware/SFPUG_DVDStore
The video is located at http://www.justin.tv/sfpug/b/335041570

Wednesday, September 19, 2012

PGOpen 2012: DVDStore Benchmark and PostgreSQL

My slides to go with the Demo I did with DVDStore Benchmark and PostgreSQL session.
http://portal.sliderocket.com/vmware/PGOpen2012_DVDStore

Other session presentations are available at https://wiki.postgresql.org/wiki/Postgres_Open_2012

Friday, July 20, 2012

Postgres @ VMware

Recently VMware vCenter Server Appliance 5.0 u1a was released with embedded Postgres based distribution. Check out the release notes.

Quote from release notes:
"vCenter Server Appliance Database Support: The DB2 express embedded database provided with the vCenter Server Appliance has been replaced with VMware vPostgres database. This decreases the appliance footprint and reduces the time to deploy vCenter Server further."

vCenter Server Appliance joins the growing list of VMware products embedding and/or supporting Postgres.

Friday, May 18, 2012

PgCon 2012: OLTP Performance Benchmarks Overview

The slides from my presentation today.

or access it on the PGCon Website.