Greenplum 6.7.1 on AWS

How to install a three-node Greenplum Database cluster with segment mirroring

Source: https://twitter.com/Greenplum

Overview

Introduction

Source: https://gpdb.docs.pivotal.io/6-7/admin_guide/intro/arch_overview.html

System Requirements

Part 1 — Create AWS Instances

AWS Instance Details:

Sample inbound rules (personal IP address edited out)

Part 2—Setup on Each Node

Source: https://github.com/greenplum-db/gpdb/releases
# Master Node
sudo hostnamectl set-hostname master;
sudo reboot;
# Segment Node 1
sudo hostnamectl set-hostname seg1;
sudo reboot;
# Segment Node 2
sudo hostnamectl set-hostname seg2;
sudo reboot;
Sample lines for /etc/hosts file
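The /etc/hosts entries map the three hostnames set above to each instance's private IP address. A minimal sketch of generating them follows. The helper name hosts_entries and the 10.0.0.x addresses are placeholders for illustration, not values from this guide, so substitute the private IPs of your own instances.

```shell
# Print /etc/hosts entries for the three nodes.
# Arguments: private IPs of master, seg1, and seg2 (in that order).
hosts_entries() {
    master_ip=$1
    seg1_ip=$2
    seg2_ip=$3
    printf '%s master\n' "$master_ip"
    printf '%s seg1\n' "$seg1_ip"
    printf '%s seg2\n' "$seg2_ip"
}

# Preview the lines. The IPs below are hypothetical placeholders.
hosts_entries 10.0.0.10 10.0.0.11 10.0.0.12
```

To apply the entries on a node, the output could be piped to `sudo tee -a /etc/hosts`.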
In /etc/ssh/sshd_config, set “PasswordAuthentication” to “yes”
sudo service sshd restart
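The PasswordAuthentication change can also be scripted instead of edited by hand. This is a sketch only: the function name enable_password_auth is hypothetical, and it assumes the directive appears at most as a plain or commented-out "PasswordAuthentication yes|no" line. Run it as root against /etc/ssh/sshd_config, then restart sshd as shown above.

```shell
# Set "PasswordAuthentication yes" in the given sshd config file.
# Usage: enable_password_auth /etc/ssh/sshd_config
enable_password_auth() {
    conf=$1
    pattern='^#?[[:space:]]*PasswordAuthentication[[:space:]]+(yes|no)'
    if grep -Eq "$pattern" "$conf"; then
        # Rewrite the existing (possibly commented-out) directive,
        # keeping a backup copy next to the file as <file>.bak.
        sed -i.bak -E "s/${pattern}.*/PasswordAuthentication yes/" "$conf"
    else
        # No directive present: append one.
        printf 'PasswordAuthentication yes\n' >> "$conf"
    fi
}
```

Password logins are only needed so that ssh-copy-id can distribute keys later; they can be switched off again once key-based access works.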
# Master Node
sudo mkdir -p /gpdata/master;
sudo mkdir -p /gpdata/mirror
# Segment Nodes
sudo mkdir -p /gpdata/primary;
sudo mkdir -p /gpdata/mirror
sudo dpkg -i greenplum-db-6.7.1-ubuntu18.04-amd64.deb
sudo groupadd gpadmin;
sudo useradd gpadmin -r -m -g gpadmin;
sudo chsh -s /bin/bash gpadmin;
sudo passwd gpadmin
sudo chown -R gpadmin:gpadmin /gpdata;
sudo chown -R gpadmin:gpadmin /usr/local/greenplum-db-6.7.1
su gpadmin
ssh-keygen -t rsa -b 4096
# Accept the defaults; do not enter a passphrase
source /usr/local/greenplum-db-6.7.1/greenplum_path.sh
echo "source /usr/local/greenplum-db-6.7.1/greenplum_path.sh; export MASTER_DATA_DIRECTORY=/gpdata/master/gpseg-1; cd ~" >> ~/.bashrc; source ~/.bashrc
Sample optional lines for .bashrc for gpadmin

Part 3—Initialize Greenplum Database

ssh-copy-id master
ssh-copy-id seg1
ssh-copy-id seg2
Sample run of ssh-copy-id
Sample hostlist file
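Two host files are referenced in this part: hostlist (passed to gpssh-exkeys) and hostlist_segonly (named in the gpinitsystem_config file). A sketch of creating both is below. The contents are an assumption inferred from the file names and the hostnames set earlier, and write_hostlists is a hypothetical helper.

```shell
# Write both host files into the given directory.
# hostlist:         every node in the cluster
# hostlist_segonly: segment nodes only (no master)
write_hostlists() {
    dir=$1
    printf '%s\n' master seg1 seg2 > "$dir/hostlist"
    printf '%s\n' seg1 seg2 > "$dir/hostlist_segonly"
}
```

For example, `write_hostlists /home/gpadmin` run as gpadmin on the master node places the files where the later commands expect them.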
gpssh-exkeys -f hostlist
cp /usr/local/greenplum-db-6.7.1/docs/cli_help/gpconfigs/gpinitsystem_config ./
# Default: declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)
declare -a DATA_DIRECTORY=(/gpdata/primary /gpdata/primary /gpdata/primary /gpdata/primary)
# The line above creates four primary segments on each segment node because the location "/gpdata/primary" appears four times.
# Any location or combination of locations can be listed here.
# Choosing an appropriate number of segments is addressed in the Pivotal docs.
# Default: MASTER_HOSTNAME=mdw
MASTER_HOSTNAME=master
# Default: MASTER_DIRECTORY=/data/master
MASTER_DIRECTORY=/gpdata/master
# Default (commented out): DATABASE_NAME=name_of_database
DATABASE_NAME=testdatabase_1
# Default (commented out): MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
MACHINE_LIST_FILE=/home/gpadmin/hostlist_segonly
Sample gpinitsystem_config file
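The five config changes listed above can be applied to the copied template in one pass. The following is a sketch, not part of the Greenplum tooling: apply_config_edits is a hypothetical helper, and it assumes the template still contains the default lines shown above.

```shell
# Apply this guide's settings to a copy of gpinitsystem_config.
# Usage: apply_config_edits ./gpinitsystem_config
apply_config_edits() {
    conf=$1
    # Each expression replaces one default line; "#?" also matches
    # settings that are commented out in the template.
    sed -i.bak -E \
        -e 's|^declare -a DATA_DIRECTORY=.*|declare -a DATA_DIRECTORY=(/gpdata/primary /gpdata/primary /gpdata/primary /gpdata/primary)|' \
        -e 's|^MASTER_HOSTNAME=.*|MASTER_HOSTNAME=master|' \
        -e 's|^MASTER_DIRECTORY=.*|MASTER_DIRECTORY=/gpdata/master|' \
        -e 's|^#?DATABASE_NAME=.*|DATABASE_NAME=testdatabase_1|' \
        -e 's|^#?MACHINE_LIST_FILE=.*|MACHINE_LIST_FILE=/home/gpadmin/hostlist_segonly|' \
        "$conf"
}
```

The original file is kept as gpinitsystem_config.bak, which makes it easy to diff the result before running gpinitsystem.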
gpinitsystem -c gpinitsystem_config
Greenplum initialization success message
gpaddmirrors -p 10000
gpstate -b
Sample gpstate run
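As a cross-check on the gpstate summary, the expected instance counts follow from the configuration: four "/gpdata/primary" entries on each of the two segment nodes, plus one mirror per primary once gpaddmirrors has run. A small sketch of that arithmetic (segment_counts is a hypothetical helper):

```shell
# Compute expected segment instance counts.
# Arguments: primaries per segment host, number of segment hosts.
segment_counts() {
    per_host=$1
    hosts=$2
    primaries=$((per_host * hosts))
    # With mirroring enabled, each primary has exactly one mirror.
    printf 'primaries=%d mirrors=%d total=%d\n' \
        "$primaries" "$primaries" "$((primaries * 2))"
}

segment_counts 4 2   # prints: primaries=8 mirrors=8 total=16
```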
/usr/local/greenplum-db-6.7.1/bin/psql -h master -d testdatabase_1 -U gpadmin
Accessing test database
