RDS

Amazon Relational Database Service

Services:

  • RDS databases covered:
    • MariaDB
    • Oracle
    • SQL Server
    • PostgreSQL
    • Aurora
    • MySQL
  • DynamoDB – NoSQL database
  • ElastiCache – in-memory caching
    • Memcached
    • Redis
  • Redshift – data warehousing
  • Database Migration Service (DMS)

Types of Processing:

  • Online Transaction Processing (OLTP)
    • Standard processing
    • Transactional data, e.g. fetching the details of one order
  • Online Analytical Processing (OLAP)
    • More complex computation over large amounts of data
    • Queries run against a copy of the data so as not to disturb production
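
To make the OLTP/OLAP difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a relational engine. The orders table, its columns, and the function names are hypothetical; nothing here talks to RDS.

```python
import sqlite3

# In-memory SQLite database standing in for a relational engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, region TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, customer, region, total) VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "east", 20.0),
        (2, "bob", "west", 35.5),
        (3, "alice", "east", 12.5),
        (4, "carol", "west", 50.0),
    ],
)


def get_order(order_id):
    """OLTP-style access: fetch the details of one order by primary key."""
    return conn.execute(
        "SELECT id, customer, region, total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


def revenue_by_region():
    """OLAP-style access: aggregate over every row in the table."""
    rows = conn.execute(
        "SELECT region, SUM(total) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


print(get_order(2))
print(revenue_by_region())
```

The first query touches a single row, while the second has to read the whole table, which is why analytic queries are usually kept away from the production database.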

Database Backups:

  • Automated Backup
    • Retention period – default is 7 days, up to 35 days
    • saved on S3
    • on by default
  • DB snapshots – taken manually (user-initiated)
  • Encryption at rest is supported using KMS
  • Enabling encryption on an existing unencrypted DB instance is currently not supported
  • In RDS, changes to the backup window take effect immediately
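
As a rough illustration of the retention limits, the helper below builds the settings for a backup change and rejects values outside the allowed range. The dictionary keys mirror the parameter names of boto3's RDS modify_db_instance call, but the sketch never calls AWS. The function name and the instance identifier "mydb" are made up for this example.

```python
def backup_settings(instance_id, retention_days=7, window=None):
    """Build keyword arguments for an RDS backup configuration change.

    retention_days: 0 disables automated backups; 1 to 35 keeps that many days.
    window: optional daily backup window, "hh24:mi-hh24:mi" in UTC.
    """
    if not 0 <= retention_days <= 35:
        raise ValueError("retention period must be between 0 and 35 days")
    params = {
        "DBInstanceIdentifier": instance_id,
        "BackupRetentionPeriod": retention_days,
        "ApplyImmediately": True,
    }
    if window is not None:
        params["PreferredBackupWindow"] = window
    return params


# Hypothetical instance name; the result could be passed on as **kwargs.
print(backup_settings("mydb", retention_days=35, window="03:00-03:30"))
```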

Multi-AZ (Multi Availability Zone) replication – RDS keeps a standby copy of the DB in another AZ. It is for disaster recovery, not for performance, and it enables automatic failover from one Availability Zone to another.

Exam NOTES:

  • DynamoDB can scale on the fly; RDS cannot.
  • RDS can scale reads (by adding Read Replicas) but not writes.
  • You CANNOT RDP or SSH to RDS instance
  • At the present time, encrypting an existing DB Instance is not supported. To use Amazon RDS encryption for an existing database, create a new DB Instance with encryption enabled and migrate your data into it. (manually import data)

Read Replica on RDS:

  • Replicates a source RDS instance to a separate RDS instance
  • Replication is asynchronous
  • Replicas are read-only copies, and each EC2 instance can connect to a replica
  • Automated backups must be enabled on the source for read replicas
  • Replicas can have replicas (this has performance implications, since each level adds replication lag)
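  • No Multi-AZ for read replicas when these notes were written (AWS has since added Multi-AZ read replicas for some engines)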
  • replicas can be promoted to their own DBs
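
The toy model below sketches what asynchronous replication means for readers: a write lands on the primary first and reaches the replica only later, so a replica read can briefly be stale. The class names and the in-memory queue are assumptions for illustration and say nothing about how RDS implements replication internally.

```python
from collections import deque


class Primary:
    """Toy primary: applies writes locally and queues them for replicas."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            replica.pending.append((key, value))  # shipped, not yet applied


class Replica:
    """Toy read-only replica that applies queued changes some time later."""

    def __init__(self, primary):
        self.primary = primary
        self.data = dict(primary.data)
        self.pending = deque()
        primary.replicas.append(self)

    def read(self, key):
        return self.data.get(key)

    def apply_pending(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

    def promote(self):
        """Detach from the source and return a standalone writable copy."""
        self.apply_pending()
        self.primary.replicas.remove(self)
        standalone = Primary()
        standalone.data = dict(self.data)
        return standalone


primary = Primary()
replica = Replica(primary)
primary.write("order:1", "paid")
stale_read = replica.read("order:1")  # replica has not applied the change yet
replica.apply_pending()
fresh_read = replica.read("order:1")
print(stale_read, fresh_read)
```

After promote(), the returned copy accepts its own writes and no longer receives changes from the original primary, which mirrors promoting a replica to its own DB.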

DynamoDB features – NoSQL service from AWS

  • Consistent latency at scale
  • Stored on SSD (data is stored in partitions)
  • Data is spread across 3 geographically distinct locations
  • Eventually consistent reads (consistency is usually reached within 1 second) or strongly consistent reads
  • Reads are cheaper than writes
  • Push-button scaling
  • DynamoDB is automatically redundant across multiple Availability Zones.
  • terms:
    • Tables – similar to tables in RDS
    • Items – similar to rows; an item is a group of attributes that represents one object in a table, e.g. in a person table each item would represent one person
      • 400 KB limit on item size
    • Attributes – similar to columns; each item is composed of attributes, e.g. FirstName, LastName, Age
  • Primary Key
    • Partition key – simple primary key; a hash function on its value determines the partition where the item is stored
    • Partition key and sort key – composite primary key; the partition key works as above, and items with the same partition key are stored sorted by sort key value
  • API: control plane (CRUD on tables), data plane (CRUD on data), DynamoDB Streams
  • Data types – scalar (number, string, binary, boolean, null), document (list, map; similar to JSON), set (string set, number set, binary set)
  • Read Consistency
    • Eventually Consistent Reads
    • Strongly Consistent Reads
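  • Provisioned throughput: 1 read capacity unit covers one strongly consistent read per second of an item up to 4 KB; 1 write capacity unit covers one write per second of an item up to 1 KB
  • Batch operations: a single batch get can return up to 16 MB of data, which can be as many as 100 items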
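  • Conditional writes – prevent concurrent overwrites of the same item by attaching a condition (ConditionExpression) that must hold for the write to succeed; ReturnConsumedCapacity only reports the capacity a request used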
  • Scan – as opposed to Query, Scan reads every item in a table or index and by default returns all of their attributes, up to a maximum of 1 MB per call
    • Scan reads sequentially by default; a parallel scan splits the table into segments for faster retrieval of large amounts of data
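
Two of the points above are easier to remember with a worked example: the capacity unit arithmetic and the idea behind conditional writes. The sketch below is plain Python and does not call DynamoDB. The function names, the ToyTable class, and the version attribute are assumptions made for illustration. In the real service the condition is passed as a ConditionExpression on the write request.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Each read is billed in 4 KB blocks; eventually consistent reads cost half."""
    blocks = math.ceil(item_size_bytes / 4096)
    units = blocks * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Each write is billed in 1 KB blocks."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


class ConditionFailed(Exception):
    """Raised when the stored item no longer matches what the writer expected."""


class ToyTable:
    """In-memory stand-in for a table, keyed by partition key."""

    def __init__(self):
        self.items = {}

    def put_if_version(self, key, attributes, expected_version):
        """Write only if the stored version still equals expected_version.

        expected_version=None means "the item must not exist yet".
        """
        current = self.items.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise ConditionFailed(f"expected {expected_version}, found {current_version}")
        self.items[key] = dict(attributes, version=(expected_version or 0) + 1)


# A 6 KB item read 10 times per second, and a 1.5 KB item written 5 times per second.
print(read_capacity_units(6 * 1024, 10))
print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))
print(write_capacity_units(1536, 5))

table = ToyTable()
table.put_if_version("person#1", {"FirstName": "Ada"}, expected_version=None)
table.put_if_version("person#1", {"FirstName": "Ada", "Age": 36}, expected_version=1)
```

A second writer that still holds version 1 would now get ConditionFailed instead of silently overwriting the newer item.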

Redshift – Data warehousing service from AWS

  • Typical database block sizes range from 2 KB to 32 KB. Amazon Redshift uses a block size of 1 MB
  • A single-node cluster can hold up to 160 GB
  • A multi-node cluster can contain up to 128 compute nodes
    • Leader node – receives queries and coordinates the compute nodes
    • Compute node – executes the query steps
      • Node slices – each compute node is partitioned into slices; each slice is allocated a portion of the node's memory and disk space to process its share of the workload
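
The leader/compute/slice split can be pictured with a small sketch: rows are spread across slices, each slice produces a partial result, and the leader combines them. This is a simplified model that runs sequentially in one process. The function name, the round-robin distribution, and the amount column are assumptions for illustration.

```python
def run_sum_query(rows, node_count=2, slices_per_node=2):
    """Sum the 'amount' column the way a leader node might fan the work out."""
    total_slices = node_count * slices_per_node
    slices = [[] for _ in range(total_slices)]

    # Distribute rows round-robin across all slices of all compute nodes.
    for index, row in enumerate(rows):
        slices[index % total_slices].append(row)

    # Each slice computes a partial result on its own portion of the data.
    partials = [sum(row["amount"] for row in part) for part in slices]

    # The leader node combines the partial results into the final answer.
    return sum(partials), partials


rows = [{"amount": n} for n in range(1, 9)]
total, partials = run_sum_query(rows)
print(total, partials)
```

In a real cluster the slices work at the same time, so adding nodes shortens the time each slice spends on its share.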

Performance:

  • Columnar data storage – column-based storage (RDS is row-based)
  • Can handle large data sets with fewer I/O operations
  • Advanced data compression – contributes to Redshift being quoted as about 10 times faster than RDS for analytic queries
    • All values in a column have the same type, so they compress well and take less I/O to read than row-based storage
    • No indexes or materialized views are required
  • Massively Parallel Processing (MPP)
  • Query optimizer
  • Compiled code – the leader node compiles the query and distributes the compiled code to the compute nodes, eliminating the overhead of interpreting it
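
Why column-based storage helps analytic queries can be shown in a few lines: an aggregate only needs to read one column, and a column of same-typed, repetitive values compresses well. The sketch below uses run-length encoding as one simple compression scheme. The sample rows and function names are made up for this example and do not reflect Redshift's actual storage format.

```python
rows = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "east", "amount": 20},
    {"id": 3, "region": "east", "amount": 30},
    {"id": 4, "region": "west", "amount": 40},
    {"id": 5, "region": "west", "amount": 50},
]


def to_columns(row_list):
    """Turn row-based records into one list of values per column."""
    columns = {}
    for row in row_list:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


columns = to_columns(rows)

# An aggregate reads only the one column it needs, not every field of every row.
total_amount = sum(columns["amount"])

# Five region values shrink to two (value, count) pairs.
compressed_region = run_length_encode(columns["region"])

print(total_amount, compressed_region)
```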

ElastiCache – in-memory caching service from AWS

Features:

  • Automatic detection and recovery from cache node failures.
  • Automatic failover (Multi-AZ) of a failed primary cluster to a read replica in Redis replication groups.
  • Flexible Availability Zone placement of nodes and clusters.
  • Integration with other AWS services such as Amazon EC2, CloudWatch, CloudTrail, and Amazon SNS to provide a secure, high-performance, managed in-memory caching solution.

Engines

  • Memcached
    • Memory object caching
    • No Multi-AZ failover
  • Redis
    • Key/value store
    • Supports data structures such as sets and lists
    • Master/slave replication
    • Multi-AZ supported
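
One common way an application uses a cache like this is to check the cache first and only query the database on a miss. The sketch below shows that pattern with a small in-memory TTL cache standing in for a Memcached or Redis node. The class, the function names, and the loader are hypothetical and no ElastiCache client is involved.

```python
import time


class TTLCache:
    """Tiny in-memory key/value cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl_seconds)


def get_with_cache(key, cache, load_from_db):
    """Return the cached value if present, otherwise load it and cache it."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(key)
    cache.set(key, value)
    return value


db_calls = []


def load_user(user_id):
    """Stand-in for a slow database query; records each call."""
    db_calls.append(user_id)
    return {"id": user_id}


cache = TTLCache(ttl_seconds=60)
get_with_cache(42, cache, load_user)
get_with_cache(42, cache, load_user)  # served from the cache
print(len(db_calls))
```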

Aurora – Amazon's own relational database engine; generally recommended when running a relational database on AWS

  • Scaling capability
  • 2 copies in each of 3 AZs – 6 copies of the data stored by default (highly durable)
  • Replication is done immediately and at no extra charge
