Automate Puppeteer Deployments on AWS Fargate with GitHub Actions: A Complete Guide


Headless browser automation with Puppeteer is widely used for web scraping, end-to-end testing, and content rendering. Running it at scale, however, means provisioning and patching servers unless you use a serverless container platform. AWS Fargate runs containers without servers to manage, and combined with GitHub Actions it gives you an automated, scalable pipeline for deploying Puppeteer applications.

In this guide, you'll package a Puppeteer app with Docker, deploy it on AWS Fargate, and automate releases with GitHub Actions.


 Prerequisites

Before starting, ensure the following:

  • An AWS account with ECS and ECR permissions

  • AWS CLI configured locally, and AWS credentials stored as GitHub repository secrets

  • A GitHub repository for your Puppeteer project

  • Docker installed locally

  • Basic understanding of ECS Fargate, GitHub Actions, and Docker


 Step 1: Dockerize Your Puppeteer App

Create a Dockerfile optimized for Puppeteer:


FROM node:18-bookworm-slim

# Install the shared libraries and fonts that Chromium needs
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    ca-certificates \
    fonts-liberation \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libcups2 \
    libdbus-1-3 \
    libdrm2 \
    libgbm1 \
    libgtk-3-0 \
    libnss3 \
    libx11-xcb1 \
    libxcomposite1 \
    libxdamage1 \
    libxfixes3 \
    libxkbcommon0 \
    libxrandr2 \
    xdg-utils \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /usr/src/app

# Install app dependencies
COPY package*.json ./
RUN npm install

# Bundle app source
COPY . .

# Run Puppeteer script
CMD ["node", "index.js"]


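If your app listens on a port, add an EXPOSE instruction for it. Because this image runs as root inside a Linux container, pass Chromium flags such as --no-sandbox and --disable-dev-shm-usage in puppeteer.launch, and keep headless mode on, since the container has no display.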
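The launch options described above could be wrapped in a small helper module, sketched here under a few assumptions: the file name browser.js and the functions `buildLaunchOptions` and `withPage` are illustrative and not part of Puppeteer, and the sandbox is disabled only because the image in this guide runs as root.

```javascript
// browser.js (hypothetical helper module): a sketch, not a definitive implementation.

// Build the options object passed to puppeteer.launch().
// The flags are standard Chromium switches commonly used in containers.
function buildLaunchOptions(extraArgs = []) {
  return {
    headless: true,
    args: [
      '--no-sandbox',             // Chromium refuses to sandbox when run as root
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',  // write shared memory files to /tmp instead of /dev/shm
      ...extraArgs,
    ],
  };
}

// Open a page at `url`, run `task(page)`, and always close the browser.
async function withPage(url, task) {
  // Required lazily so buildLaunchOptions can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(buildLaunchOptions());
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await task(page);
  } finally {
    await browser.close(); // release Chromium even if the task throws
  }
}

module.exports = { buildLaunchOptions, withPage };
```

An index.js built on this sketch might call withPage(process.env.TARGET_URL, (page) => page.title()) and log the result, where TARGET_URL is a hypothetical environment variable set in the task definition.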


 Step 2: Push Docker Image to AWS ECR

  1. Create ECR Repository:


aws ecr create-repository --repository-name puppeteer-app


  2. Authenticate Docker to ECR:


aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <your-account-id>.dkr.ecr.<region>.amazonaws.com


  3. Build and Push Image:


docker build -t puppeteer-app .
docker tag puppeteer-app:latest <your-repo-uri>:latest
docker push <your-repo-uri>:latest
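The repository URI used in the commands above follows a fixed pattern in the standard commercial AWS regions. As a rough illustration, it can be composed like this; `ecrImageUri` is a hypothetical helper and the account ID is a placeholder.

```javascript
// Hypothetical helper: compose the ECR image URI used for tagging and pushing.
function ecrImageUri({ accountId, region, repository, tag = 'latest' }) {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error('AWS account IDs are 12 digits');
  }
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${repository}:${tag}`;
}

console.log(ecrImageUri({
  accountId: '123456789012', // placeholder account ID
  region: 'us-east-1',
  repository: 'puppeteer-app',
}));
// prints "123456789012.dkr.ecr.us-east-1.amazonaws.com/puppeteer-app:latest"
```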



 Step 3: Deploy on AWS Fargate

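  1. Create an ECS task definition for the Fargate launch type, with a task execution role that can pull from ECR and write to CloudWatch Logs (for example, a role with the managed AmazonECSTaskExecutionRolePolicy attached).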

  2. Configure Container Settings:

    • Image: Your ECR image URI

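    • Memory: 512 MiB or more (Chromium is memory-hungry, so 1 to 2 GiB is often a safer starting point)

    • CPU: 256 units (0.25 vCPU) or more; Fargate accepts only specific CPU and memory pairings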

    • Set any environment variables your script needs.

  3. Create ECS Cluster and Service:

    • Use the Fargate launch type.

    • Set up VPC and subnet configurations.

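    • Enable auto-assign public IP if tasks run in a public subnet and need internet access (including pulling the image from ECR); tasks in private subnets need a NAT gateway or VPC endpoints instead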

  4. Test Deployment: Trigger the service and validate the output logs via CloudWatch.
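The container settings above can also be captured as a task definition file, which is what the deploy step in Step 4 expects under the name puppeteer-task-def.json. The following is a minimal sketch that prints such a file; the script name make-task-def.js, the family name, the log group, the role ARN, and TARGET_URL are all placeholders chosen for illustration.

```javascript
// make-task-def.js (hypothetical script): prints a Fargate task definition as JSON.
// Every name and ARN here is a placeholder, not a value from a real account.
function buildTaskDefinition({ image, executionRoleArn, region, environment = {} }) {
  return {
    family: 'puppeteer-task',
    requiresCompatibilities: ['FARGATE'],
    networkMode: 'awsvpc',   // the only network mode Fargate supports
    cpu: '256',              // 0.25 vCPU
    memory: '512',           // MiB; must be a valid pairing with cpu
    executionRoleArn,        // lets ECS pull from ECR and write logs
    containerDefinitions: [
      {
        name: 'puppeteer-app',
        image,
        essential: true,
        // ECS expects environment variables as { name, value } string pairs.
        environment: Object.entries(environment).map(([name, value]) => ({
          name,
          value: String(value),
        })),
        logConfiguration: {
          logDriver: 'awslogs',
          options: {
            'awslogs-group': '/ecs/puppeteer-app', // assumed to exist already
            'awslogs-region': region,
            'awslogs-stream-prefix': 'ecs',
          },
        },
      },
    ],
  };
}

const taskDef = buildTaskDefinition({
  image: '123456789012.dkr.ecr.us-east-1.amazonaws.com/puppeteer-app:latest',
  executionRoleArn: 'arn:aws:iam::123456789012:role/ecsTaskExecutionRole',
  region: 'us-east-1',
  environment: { TARGET_URL: 'https://example.com' }, // hypothetical variable
});
console.log(JSON.stringify(taskDef, null, 2));
```

Running node make-task-def.js > puppeteer-task-def.json would produce a file you can commit to the repository and adjust by hand. Raise cpu and memory together if Chromium runs out of memory.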


 Step 4: Automate with GitHub Actions

Create .github/workflows/deploy.yml in your repository:


name: Deploy Puppeteer to AWS Fargate

on:
  push:
    branches: [main]

jobs:
  build-deploy:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Set up Docker
      uses: docker/setup-buildx-action@v3

    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-1

    - name: Log in to Amazon ECR
      id: login-ecr
      uses: aws-actions/amazon-ecr-login@v2

    - name: Build, tag, and push image
      run: |
        IMAGE_URI="${{ steps.login-ecr.outputs.registry }}/puppeteer-app:latest"
        docker build -t "$IMAGE_URI" .
        docker push "$IMAGE_URI"

    # Registers puppeteer-task-def.json from the repository and updates the service
    - name: Deploy to ECS
      uses: aws-actions/amazon-ecs-deploy-task-definition@v2
      with:
        task-definition: puppeteer-task-def.json
        service: puppeteer-service
        cluster: puppeteer-cluster
        wait-for-service-stability: true


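Store the AWS credentials as GitHub Actions secrets (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the workflow above) rather than committing them, and limit the IAM user to the ECR and ECS permissions the workflow needs.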


 Optional: Trigger via Manual Dispatch or Schedule

You can extend your workflow to include:

  • workflow_dispatch for manual deployments

  • schedule with cron expressions for periodic Puppeteer runs

Example:


on:
  push:
    branches: [main]
  workflow_dispatch:  # Manual runs from the Actions tab
  schedule:
    - cron: '0 0 * * *'  # Every day at midnight UTC



 Benefits of This Setup

  • Serverless scaling with AWS Fargate

  • CI/CD automation with GitHub Actions

  • Headless browser support with Puppeteer in Docker

  • Easy maintenance via version-controlled GitHub workflows


 Conclusion

This automation pipeline simplifies the deployment of Puppeteer workloads using modern DevOps practices. Whether you're scraping data, performing automated testing, or rendering content, combining Puppeteer, AWS Fargate, and GitHub Actions offers a robust and scalable solution with minimal infrastructure overhead.
