Posts Tagged aws

Quick and Dirty Shoestring Startup Infra

At the University of Waterloo, we have a Final Year Design Project/Capstone project. My group is working on a conference management suite called Calligre. We’ve been approaching it as kind of a startup – we presented a pitch at a competition and won! While sorting out admin details with the judges after, they were oddly impressed that we had email forwarding for all the group members at our domain. Apparently it’s pretty unique.

To document the setup, both for myself and for other people to refer to, I decided to write down everything that I did.

Note that we’re students, so we get a bunch of discounts, most notably the GitHub Student Pack. If you’re a student, go get it.

Domain

  1. Purchase a domain. NameSilo is my go-to domain purchaser because they have free WHOIS protection, and some of the cheapest prices I’ve seen.
    Alternatives to NameSilo include Namecheap and Gandi. Of the two, I prefer Gandi, since they don’t have weird fees, but Namecheap periodically runs really good promos that drop the price significantly.
  2. Use a proper DNS server. Sign up for the CloudFlare free plan – not on your personal account, but a completely new one. CloudFlare doesn’t have account sharing for the free plan yet, so Kevin and I are just using LastPass to securely share the password to the account. For bonus points, hook CloudFlare up to Terraform and use Terraform to manage DNS settings.
    Alternatives include DNSimple (2 years free in the GitHub Student Pack!) and AWS Route 53.
  3. Sign up for Mailgun – they allow you to send 10000 messages/month for free. However, if you sign up with a partner (eg Google), they’ll bump that limit for you. We’re sitting at 30000 emails/month, though we needed to provide a credit card to verify that we were real people.
    Follow the setup instructions to verify your domain, but also follow the instructions to receive inbound email. This allows you to set up routes.
    Alternatives include Mailjet (25k/month through Google) and SendGrid (15k emails/month through the GitHub Student Pack). SendGrid doesn’t appear to do email forwarding, though they will happily take incoming emails and post them to a webhook.
  4. Once you have domains verified and email set up, activate email forwarding. Mailgun calls this “Routes”. We created an address for each member of the team, as well as contact/admin aliases that forward to people as appropriate. I recommend keeping this list short – you’ll be managing it manually.
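
The forwarding list from step 4 can be kept as data and turned into route definitions. This is a minimal Python sketch, assuming Mailgun's `match_recipient(...)` filter and `forward(...)` action syntax; the domain, aliases and destination addresses are made up for illustration, and the code only builds the definitions rather than calling the Mailgun API.

```python
# Hypothetical alias table: local part at our domain -> where mail should go.
FORWARDS = {
    "alice": ["alice@example.net"],
    "contact": ["alice@example.net", "bob@example.org"],
}


def build_routes(domain, forwards):
    """Return one Mailgun-style route definition per alias."""
    routes = []
    for alias, targets in sorted(forwards.items()):
        routes.append({
            "description": f"Forward {alias}@{domain}",
            # Filter expression: match mail sent to this exact address.
            "expression": f'match_recipient("{alias}@{domain}")',
            # One forward() action per destination mailbox.
            "action": [f'forward("{target}")' for target in targets],
        })
    return routes


if __name__ == "__main__":
    for route in build_routes("example.com", FORWARDS):
        print(route["expression"], "->", ", ".join(route["action"]))
```

Keeping the table in one place makes it easier to see at a glance who receives what, even if the routes themselves are still entered by hand.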

Hosting

  1. We currently have a basic landing page. This is still in active development, so we use GitHub Pages in conjunction with a custom domain setup until it’s done. This will eventually be moved to S3/another static site host. For now though, it’s free.
  2. Sign up for a new AWS account.
  3. Register for AWS Educate using the link in the GitHub Student Pack. This gets you $50 worth of credit (base $35 + extra $15). Good news for uWaterloo people: I’ve asked to get uWaterloo added as a member institution to AWS Educate, so we should be getting an additional $65 of credit when that happens.
    Note that if you just need straight hosting, sign up for Digital Ocean as well – Student Pack has a code for $50/$100 of credit!

AWS

  1. In AWS, create an IAM User Account for each user. I also recommend using groups to assign permissions to users, so you don’t need to duplicate permissions for every single user. I just gave everyone in our team full admin access; we’ll see if that’s still a good idea 6 months down the road…
  2. Change the account alias so people don’t need to know the 12-digit account ID to log in.
  3. Enable IAM access to the billing information, so you don’t have to flip between the root account & your personal IAM account whenever you want to check how much you’ve spent.
  4. Enable 2 factor auth on the root account at the very least. Let another person in the team add the account to their Google Authenticator/whatever, so you’re not screwed if you have your phone stolen/otherwise lose it.

More stuff as and when I think about it.


Notes from various AWS Investigations

  • AWS CloudWatch Logs storage charge == S3 storage charge. Possibly less, since the logs are gzipped at level 6 first.
  • CW Logs makes more sense than using AWS Elasticsearch at small scale – Elasticsearch prices start at 1.8c an hour + EBS charges, vs 50c/GB of log ingestion + storage for CW Logs.
  • For pure log storage & bulk retrieval, S3 makes far more sense than either Elasticsearch or CloudWatch Logs. Backblaze B2 costs ~20% of what S3 does though, so it makes even more sense.

  • DynamoDB streams are for watching what happens to a table, and they rotate every ~24 hours, so you’d get charged on a rolling basis, and can’t delete individual events. I’m assuming events don’t disappear once you’ve processed them.

  • Cert Manager is in more regions! But it only makes a difference if you hang stuff behind an ELB. Certs for CloudFront still have to go through us-east-1.
  • API Gateway has direct integration with DynamoDB, doing an end run around Lambda functions that just insert & retrieve records (aws.amazon.com/blogs/compute/using-amazon-api-gateway-as-a-proxy-for-dynamodb/) Amusingly, models continue to not be used. (I still don’t understand what Models are supposed to do/enforce)
  • DynamoDB cross-region replication is weird. You spin up an EC2 instance that handles it for you. I wonder if the DynamoDB team will work on managed replication…
  • DynamoDB is stupid cheap, and it makes sense for me to migrate the vast majority of my DB centric stuff to it.
  • CloudFront has a weird “$0.000 per request – HTTP or HTTPS under the global monthly free tier” for requests, and I’m not sure why. My account is long out of the free tier.
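
The CloudWatch Logs versus Elasticsearch note above boils down to a break-even calculation. Here is a rough Python sketch using only the two prices quoted in the notes; the 730-hour month is an assumption, and EBS and storage charges are deliberately ignored, so treat the result as a ballpark.

```python
HOURS_PER_MONTH = 730  # average month, an assumption for the estimate

ES_CENTS_PER_HOUR = 1.8        # smallest Elasticsearch instance, from the notes
CW_CENTS_PER_GB_INGESTED = 50  # CloudWatch Logs ingestion, from the notes


def elasticsearch_monthly_cents(hours=HOURS_PER_MONTH):
    """Instance cost only; EBS charges are ignored."""
    return ES_CENTS_PER_HOUR * hours


def cloudwatch_monthly_cents(gb_ingested):
    """Ingestion cost only; storage charges are ignored."""
    return CW_CENTS_PER_GB_INGESTED * gb_ingested


def break_even_gb():
    """GB/month of logs at which both options cost the same."""
    return elasticsearch_monthly_cents() / CW_CENTS_PER_GB_INGESTED


if __name__ == "__main__":
    print(f"Elasticsearch floor: ${elasticsearch_monthly_cents() / 100:.2f}/month")
    print(f"Break-even ingestion: {break_even_gb():.1f} GB/month")
```

Under these assumptions the break-even sits at roughly 26 GB of logs a month; below that, CloudWatch Logs is the cheaper of the two.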


Using Amazon S3 + CloudFront + Certificate Manager to get seamless static HTTPS support

TL;DR: This post documents the process I took to get S3 to redirect requests, over both HTTP and HTTPS, to a given domain.

I’m trying to trim down the number of domains and subdomains that I host on my server, since I’m trying a new policy of moving servers every few months in an attempt to make sure I automate everything I can.

One of the things that I’ve done was to start consolidating static files under a single subdomain, and use 301 redirects in nginx to point to the new location. Thanks to Ansible, rolling out the redirect config is a matter of adding a new domain + target pair to a .yml file and running it against a server.

But it’d still be nice if I didn’t have as many moving parts – which meant that I looked at ways to get an external provider to host this for me. I decided to give S3’s static website hosting a try to see if it supported everything I wanted it to do. In this case, I want to return either a redirect to a fixed URL, or a redirect to a different domain, but same filename. Essentially, my nginx redirect configs are either return 301 https://kyle.io/$request_uri or return 301 https://kyle.io/fixed-location.

For the purposes of this post, let’s assume I have the domain tw.kyle.io that redirects to my Twitter profile.

Creating the S3 Bucket

I knew that S3 could do static site hosting – but the docs seem to indicate that while redirecting to a different domain is possible, it will still use the same path to the file. So this would work for the different domain, same file name case, but not the fixed URL case.

I found my solution in an example of redirecting on an HTTP 404 error – but it could be adjusted to redirect all the time by removing the Condition elements.

So let’s create the bucket with the AWS CLI: aws s3api create-bucket --bucket tw-kyle-io --create-bucket-configuration LocationConstraint=us-west-2

Then I had to apply the redirection rules. The CLI uses JSON format to specify the redirect rules, so to make things simple, I dumped the config into a file:

{
  "IndexDocument": {
    "Suffix": "redirect.html"
  },
  "RoutingRules": [
    {
      "Redirect": {
        "HostName": "twitter.com",
        "Protocol": "https",
        "ReplaceKeyWith": "lightweavr"
      }
    }
  ]
}

and then called it through the CLI: aws s3api put-bucket-website --bucket tw-kyle-io --website-configuration file://website.json. Note that the use of IndexDocument is misleading: I never actually created a file in S3, but the S3 API requires that something be specified for that key.
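
Both redirect flavours can be generated instead of hand-written. A minimal Python sketch follows, assuming the `RedirectAllRequestsTo` element of the S3 website configuration for the same-path case; the function names are made up, and the output is just JSON to save to a file and pass to put-bucket-website.

```python
import json


def fixed_url_redirect(host, key, protocol="https"):
    """Website config that sends every request to one fixed URL."""
    return {
        # S3 insists on an index document even though it is never served.
        "IndexDocument": {"Suffix": "redirect.html"},
        "RoutingRules": [
            {
                # No "Condition" element, so the rule applies to every request.
                "Redirect": {
                    "HostName": host,
                    "Protocol": protocol,
                    "ReplaceKeyWith": key,
                }
            }
        ],
    }


def same_path_redirect(host, protocol="https"):
    """Website config that keeps the requested path and only swaps the domain."""
    return {"RedirectAllRequestsTo": {"HostName": host, "Protocol": protocol}}


if __name__ == "__main__":
    print(json.dumps(fixed_url_redirect("twitter.com", "lightweavr"), indent=2))
    print(json.dumps(same_path_redirect("kyle.io"), indent=2))
```

The first function reproduces the config shown above; the second covers the "different domain, same file name" case.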

People who have created a static website in S3 might be saying that I used a bad bucket name, because now I can’t use CNAMEs with the bucket. Well, I have the useful experience of writing this after discovering that HTTPS requests to the bucket don’t work because S3 doesn’t support HTTPS to the website endpoint. CloudFront doesn’t have restrictions on bucket naming, especially where we’re going to use the website endpoint so we can use redirections. (Hat tip to a StackOverflow question and another one as well.)

Amazon Certificate Manager

So I ended up using CloudFront. But first, I had to create a certificate with ACM. Because I’m creating a subdomain cert, I had to use the CLI – the console only allows you to create a *.domain.com or domain.com cert.

aws acm request-certificate --domain-name tw.kyle.io --domain-validation-options DomainName=tw.kyle.io,ValidationDomain=kyle.io

I had to approve the certificate creation, which required waiting for the email. If you try to create the distribution before the certificate is approved, you get an error about the cert not existing: A client error (InvalidViewerCertificate) occurred when calling the CreateDistribution operation: The specified SSL certificate doesn't exist, isn't valid, or doesn't include a valid certificate chain.

CloudFront Magic

Then, it was time to configure the CloudFront distribution, which isn’t trivial: there’s a bunch of options, and it wouldn’t surprise me if I screwed something up. To get the options below, I created a distribution through the AWS console, then compared that against a generated template (aws cloudfront create-distribution --generate-cli-skeleton).

I put the following in a file named cf.json

{
    "DistributionConfig": {
        "CallerReference": "tw-kyle-io-20160402",
        "Aliases": {
            "Quantity": 1,
            "Items": [
                "tw.kyle.io"
            ]
        },
        "DefaultRootObject": "",
        "Origins": {
            "Quantity": 1,
            "Items": [
                {
                    "DomainName": "tw-kyle-io.s3-website-us-west-2.amazonaws.com",
                    "Id": "tw.kyle.io-redirect",
                    "CustomOriginConfig": {
                        "HTTPPort": 80,
                        "HTTPSPort": 443,
                        "OriginProtocolPolicy": "http-only",
                        "OriginSslProtocols": {
                            "Quantity": 1,
                            "Items": [
                                "TLSv1.2"
                            ]
                        }
                    }
                }
            ]
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": "tw.kyle.io-redirect",
            "ForwardedValues": {
                "QueryString": false,
                "Cookies": {
                    "Forward": "none"
                },
                "Headers": {
                    "Quantity": 0
                }
            },
            "TrustedSigners": {
                "Enabled": false,
                "Quantity": 0
            },
            "ViewerProtocolPolicy": "allow-all",
            "MinTTL": 86400,
            "AllowedMethods": {
                "Quantity": 2,
                "Items": [
                  "HEAD",
                  "GET"
                ],
                "CachedMethods": {
                    "Quantity": 2,
                    "Items": [
                      "HEAD",
                      "GET"
                    ]
                }
            },
            "SmoothStreaming": false,
            "DefaultTTL": 86400,
            "MaxTTL": 31536000,
            "Compress": true
        },
        "Comment": "Redirect for tw.kyle.io",
        "Logging": {
            "Enabled": false,
            "IncludeCookies": false,
            "Bucket": "",
            "Prefix": ""
        },
        "PriceClass": "PriceClass_100",
        "Enabled": true,
        "ViewerCertificate": {
            "ACMCertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/3f1f4661-f01b-4eef-ae0d-123412341234",
            "SSLSupportMethod": "sni-only",
            "MinimumProtocolVersion": "TLSv1",
            "Certificate": "arn:aws:acm:us-east-1:111122223333:certificate/3f1f4661-f01b-4eef-ae0d-123412341234",
            "CertificateSource": "acm"
        }
    }
}

and then ran the command aws cloudfront create-distribution --cli-input-json file://cf.json.
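
With this many nested Quantity/Items pairs, a cheap local check before calling the API can save a round of trial and error. The Python sketch below is not part of the AWS tooling; it is a hypothetical helper that walks a config and reports every place where "Quantity" disagrees with the number of "Items".

```python
import json


def quantity_mismatches(node, path="DistributionConfig"):
    """Return paths where "Quantity" disagrees with the length of "Items"."""
    problems = []
    if isinstance(node, dict):
        if "Quantity" in node:
            # A missing "Items" list counts as zero items.
            actual = len(node.get("Items", []))
            if node["Quantity"] != actual:
                problems.append(f"{path}: Quantity={node['Quantity']}, Items={actual}")
        for key, value in node.items():
            problems.extend(quantity_mismatches(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            problems.extend(quantity_mismatches(value, f"{path}[{index}]"))
    return problems


if __name__ == "__main__":
    # Deliberately broken sample: three methods claimed, two listed.
    sample = json.loads("""
    {
      "Aliases": {"Quantity": 1, "Items": ["tw.kyle.io"]},
      "AllowedMethods": {"Quantity": 3, "Items": ["HEAD", "GET"]}
    }
    """)
    for problem in quantity_mismatches(sample):
        print(problem)
```

To check the real file, load cf.json with json.load and pass its "DistributionConfig" value to the function.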

Testing it

The first thing to test was simply curling the newly created distribution – after waiting ages for it to actually be created. (I assume CloudFront is just continually working through a queue of distribution changes to each of the 40+ POPs, so it’s actually quite awesome.)

$ curl -v d28h66yisj403x.cloudfront.net
* Rebuilt URL to: d28h66yisj403x.cloudfront.net/
>clipped<
< HTTP/1.1 301 Moved Permanently
< Location: https://twitter.com/lightweavr
< Server: AmazonS3
< X-Cache: Miss from cloudfront
* Connection #0 to host d28h66yisj403x.cloudfront.net left intact
$ curl -v https://d28h66yisj403x.cloudfront.net
* Rebuilt URL to: https://d28h66yisj403x.cloudfront.net/
>clipped<
< HTTP/1.1 301 Moved Permanently
< Location: https://twitter.com/lightweavr
< Server: AmazonS3
< Age: 11
< X-Cache: Hit from cloudfront

Would you look at that, both the HTTP & HTTPS connections worked!

So I changed the CNAME in CloudFlare to point to the new distribution, and tried again, this time with tw.kyle.io.

$ curl -v https://tw.kyle.io
* Rebuilt URL to: https://tw.kyle.io/
>clipped<
< HTTP/1.1 301 Moved Permanently
< Location: https://twitter.com/lightweavr
< Age: 577
< X-Cache: Hit from cloudfront
< Via: 1.1 f360bbb3d1999b5324e1d7ae31da1d7e.cloudfront.net (CloudFront)
< X-Amz-Cf-Id: kl8eZMDzH2BB7T3owENtjFkS2xtfcwqoOsZ4-SNxLY8LMdupbXrp9Q==
< X-Content-Type-Options: nosniff
< Server: cloudflare-nginx
< CF-RAY: 28d9413b3943302a-YYZ

Because I use CloudFlare’s caching layer, parts of the response are overwritten by CloudFlare (notably the Server header, among others). But we can still see that the request ultimately hit the CloudFront frontend. The request was still redirected, so everything looks good.

In Closing

The main thing I didn’t like about this setup? The obtuse documentation. I had a far better grasp of everything after working through the console first, then applying that to the CLI. But there are still tiny things.

  1. I hit the ancient ‘Conflicting Conditional Operation’ issue about S3 buckets being quick to be deleted from the console, but not actually deleted in the backend. So when I mistakenly thought that --region us-west-2 would be enough to get the S3 bucket to be created in us-west-2, it took ~1 hour to become useable again.

  2. Passing --region us-west-2 to aws s3api create-bucket will create an S3 bucket in the default region, us-east-1. You have to specifically pass --create-bucket-configuration LocationConstraint=us-west-2 to get it created in a specific region.

  3. There’s aws s3 and aws s3api. Why aren’t the s3api operations merged into s3? No other service has an api suffix.

  4. Having to trial and error to find out what parameters are required and what aren’t for different operations. The CloudFront create-distribution command was the worst offender of the commands I ran, just with the sheer number of parameters. I’m hoping the documentation improves before it comes out of beta.

  5. Weird UI bugs/features. Main one I noticed was the ACM certificate selector not being selectable unless a cert exists… and the refresh button is included in that. So the very first time I created a CloudFront distribution, I needed to refresh the page to be able to select the cert, losing my settings.
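
The region gotcha from item 2 is easy to wrap so it can’t be forgotten. A small Python sketch, assuming us-east-1 is the one region where the location constraint is left out; the helper name is made up, and it only builds the command line rather than running it.

```python
import shlex


def create_bucket_command(bucket, region):
    """Build the `aws s3api create-bucket` invocation for a region."""
    args = ["aws", "s3api", "create-bucket", "--bucket", bucket]
    if region != "us-east-1":
        # Outside the default region the location must be stated explicitly;
        # --region alone does not decide where the bucket lives.
        args += [
            "--create-bucket-configuration",
            f"LocationConstraint={region}",
        ]
    return args


if __name__ == "__main__":
    print(shlex.join(create_bucket_command("tw-kyle-io", "us-west-2")))
```

The returned list can be handed straight to subprocess.run once the printed command looks right.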

Now for the important question: Now that I’ve set it up, will I keep on using it? The simple answer is that I’m not sure. It’s nice that it’s offloaded to another provider that keeps it going without my intervention, and that after setting it up it won’t change. But at the same time, it’s another monthly expense. I’ve already put money into a server, and the extra load of a redirection config is negligible thanks to Ansible. There is some overhead with creating & managing Let’s Encrypt certs, particularly with the rate-limiting, and these redirects are entire subdomains, but I just set up an Ansible Let’s Encrypt playbook to run weekly, and the certs will be kept up to date.

It’s going to come down to how much I get billed for this.


Cheaping out on EC2 – using Spot Instances

Amazon markets Spot Instances as a way to reduce the price you pay for instances. So, continuing my efforts to reduce expenses on EC2, I looked into using spot instances. Spot instances are essentially just like normal instances. You can create your own AMIs, where you essentially create an image and tell Amazon to create instances based on that image, or use an existing AMI.

If you want to create an AMI, get a starting image, and customize it as necessary. I started with the Fedora 17 image. In an attempt to reduce the cost, I resized the disk from 10GB to 2GB, installed vim, less, screen and rsync, which oddly aren’t in the default Fedora install.

I then had to package it as a new AMI – this created an EBS snapshot, so I’m happy that I resized the disk. It’s a bit annoying that you’re going to be paying for an EBS snapshot AND the active EBS volumes, but in virtually all cases, the cost of the EBS snapshot won’t exceed the amount saved by using spot instances. If you have a bigger snapshot, it’ll cost more of course, but then you’d likely be using a more expensive EC2 instance, so the cost should balance out in the long run.

As for actually using the spot instance, I had my AMI set up to automatically start an IRC bot, so I used this for timing. The IRC bot came online ~7 minutes after I submitted the request to start the spot instance, so there’s a bit of lead time, but not too much. Because of the lead time, the instance won’t appear in the instance list for a while.

And an extra tip: Don’t be like me and not realise the spot instance actually started, and leave it running for two months racking up charges, only to be notified by Amazon that you now owe them money after your credit runs out. (Thankfully, they waived the charges as a one time thing.)

So now by default I set an expiry time of a day on all my spot instance requests if I know I’m only going to have them up for a few hours.
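
As a sketch of what that expiry looks like in code: the Python below builds the keyword arguments that boto3’s EC2 `request_spot_instances` call accepts, with `ValidUntil` set one day out. The AMI ID and bid price are placeholders, and nothing here talks to AWS. Note that the expiry applies to the request itself; an instance that has already launched should still be terminated separately.

```python
from datetime import datetime, timedelta, timezone


def spot_request_params(ami_id, max_price, now=None, lifetime=timedelta(days=1)):
    """Build keyword arguments for an EC2 spot request that expires by itself."""
    now = now or datetime.now(timezone.utc)
    return {
        "SpotPrice": str(max_price),   # maximum hourly bid, passed as a string
        "InstanceCount": 1,
        "Type": "one-time",            # do not relaunch after termination
        "ValidUntil": now + lifetime,  # request expires after this moment
        "LaunchSpecification": {
            "ImageId": ami_id,
            "InstanceType": "t1.micro",
        },
    }


if __name__ == "__main__":
    params = spot_request_params("ami-00000000", 0.01)  # placeholder AMI ID
    print("Request expires at", params["ValidUntil"].isoformat())
```

With boto3 installed and credentials configured, the dictionary could be passed as `client.request_spot_instances(**params)`.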

And one thing to look at if you require access to your data and can get by with using a pre-created image is using instance stores and mounting EBS volumes with the API. I didn’t try it because apparently, the t1.micro size that I’m using doesn’t support instance stores. Of course, this only really makes sense if you don’t want to pay the cost of having the spot instance run off an EBS volume. For a large-scale operation, it could be worth it.


Cheaping out on EC2 – Downsizing existing EBS volumes

I’m moving a VM from my own server to EC2. I’ve got $50 worth of credit from a Red Hat event, so I would like to make use of that. (That, and my AWS account existed before the free tier was introduced, so I’m not eligible for that.) That said, I want to make that credit last as long as possible. The first part of this was resizing the EBS volumes so I don’t get dinged for more than I need.

Amazon says $0.10 per GB-month of provisioned storage – so I’d be paying ~80 cents/month for storage that I’m not using. Admittedly, it isn’t that much, but that’s where I looked first.
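
The arithmetic behind that figure, as a tiny Python sketch. The rate is the one quoted above; the 10 GB and 2 GB volume sizes are an assumption borrowed from the spot instances post, which matches the ~80 cents estimate.

```python
CENTS_PER_GB_MONTH = 10  # $0.10 per GB-month, the rate quoted above


def monthly_cost_cents(volume_gb):
    """Monthly charge for a provisioned EBS volume, in cents."""
    return CENTS_PER_GB_MONTH * volume_gb


def monthly_savings_cents(old_gb, new_gb):
    """How much shrinking a volume saves per month, in cents."""
    return monthly_cost_cents(old_gb) - monthly_cost_cents(new_gb)


if __name__ == "__main__":
    # Assumed sizes: 10 GB before and 2 GB after resizing.
    print(f"Savings: {monthly_savings_cents(10, 2)} cents/month")
```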


Removing bloat from Fedora 17 on EC2

I’m in the middle of trialing EC2, and I’m using the official Fedora 17 images kindly provided by the community.

They make a great starting point, because I can then install my needed software.

Some of which, though, I consider absolutely crucial. So far, I’ve needed to install vim, less, rsync and screen. I can forgive rsync and screen, and to a certain extent less… but vim?

Especially when things like NetworkManager, ModemManager and mobile-broadband-provider-info are taking up space. Then there’s stuff like plymouth – we’re not seeing a graphical boot screen, so I should be able to erase this, right?

I hope that I don’t have to roll my own AMI image though. Just modify this one as needed, and create an AMI out of it.
