How I built a DALL-E Clone using MERN

GitHub Repository: https://github.com/PSkinnerTech/aureliusai-art

Artificial intelligence (AI) is transforming the way we interact with technology, and the possibilities are limitless. One of the most exciting developments in the field of AI is the creation of image generation models like DALL-E. DALL-E, developed by OpenAI, can generate images from textual prompts, creating images that have never existed before. In this blog, I will be discussing how I built a DALL-E image generator clone using React JS, ExpressJS, MongoDB, Cloudinary, and Node.js.

Note: I plan on later adding an option to use either DALL-E or MidJourney as the AI model. When I do, I'll be sure to update this blog. (Mar 3rd, 2023)

Building the Frontend

Vite.js

To start, I built the frontend of the application using React JS. Instead of reaching straight for "create-react-app," I decided to experiment with Vite.js as the development environment. Vite.js let me spin up a dev server for frameworks like React and Vue, and even for vanilla JavaScript apps, with hot module replacement after just three commands. Vite uses Rollup.js internally for production bundling, making it a powerful tool for building large-scale applications.
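For reference, the setup really is about three commands once you cd into the new project folder (the project name here is just a placeholder, and the scaffolding prompt can also pick the React template interactively):

# scaffold a React project with Vite
npm create vite@latest dalle-clone -- --template react

# install dependencies and start the dev server with hot reload
npm install
npm run dev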

Tailwind CSS

Once I had Vite.js set up, I added my CSS framework, Tailwind CSS. Tailwind CSS is an open-source CSS framework. Its main feature is that, unlike frameworks such as Bootstrap, it does not provide a series of predefined classes for elements like buttons or tables. Instead, it provides a set of utility classes that can be used to style any element, which makes it more flexible and customizable than other CSS frameworks.
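As a quick illustration (the class names here are just an example, not pulled from the project), a button styled entirely with utility classes looks like this:

<button className="rounded-md bg-blue-600 px-4 py-2 font-medium text-white hover:bg-blue-700">
  Generate
</button>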

After setting up my app and CSS framework, I built out my src folder. This included three main folders: assets, components, and pages. The assets folder contains all of my image assets, the components folder contains all of my major page snippets (image cards, the image generator's form field, and the loading features), and the pages folder contains my two main pages (Home.jsx and CreateImage.jsx).
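The layout ended up roughly like this (the individual component file names are illustrative and may not match the repo exactly):

client/src
├── assets/        # image assets (logo, preview image, etc.)
├── components/    # Card, FormField, Loader
└── pages/
    ├── Home.jsx
    └── CreateImage.jsx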

Building the Backend

Once I had all of the UI built for my image generator, I started building out the backend (the server folder). I set up my database with MongoDB, my image storage for the community page with Cloudinary, and then wired up the OpenAI API.

MongoDB

Setting up MongoDB is a very easy and straightforward process. MongoDB is a NoSQL database, which means it stores data in a JSON-like format. It is scalable and flexible, making it an excellent choice for applications that require large amounts of data.
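In practice, the setup boils down to a connection helper that the Express server calls on startup. A minimal sketch, assuming the connection string lives in a MONGODB_URL environment variable (the file layout and variable name are assumptions, not copied from the repo):

import mongoose from 'mongoose';

// Connect to MongoDB Atlas (or any MongoDB instance) using the URL from .env
const connectDB = (url) => {
  mongoose.set('strictQuery', true);

  mongoose
    .connect(url)
    .then(() => console.log('MongoDB connected'))
    .catch((err) => console.error('Failed to connect to MongoDB', err));
};

export default connectDB;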

Cloudinary

Setting up Cloudinary was a bit more challenging. Cloudinary is a cloud-based image and video management solution that provides a range of features for managing, manipulating, and delivering images and videos. It took me some time to figure out how to set up the Cloudinary API and how to use it with my application. However, once I had it set up, it was a breeze to use.
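Once the credentials are in the server's .env, the configuration itself is only a few lines. A sketch, with environment variable names that are my assumption of a typical setup:

import { v2 as cloudinary } from 'cloudinary';
import * as dotenv from 'dotenv';

dotenv.config();

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

// After this, uploading a base64 image is a single call:
// const result = await cloudinary.uploader.upload(photo);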

OpenAI: The Most Painful Part

OpenAI has been the most challenging part of building this application. Despite being used by millions, OpenAI still has quite a few issues. It wasn't until I signed up for their API status notifications that I realized the API seems to go down almost every other day, if not daily. One of the first issues without an immediately obvious fix was a "402" billing error that appeared every time I tried to generate an image. It took me two days to realize that the $18 of free API credit you receive when creating an OpenAI account had expired.

Update Request?

I really wish there were something more obvious in the account dashboard to let you know that your trial credit has expired. As it stands, you have to dig into your account's billing page to check. If it has expired, just create a new account and set up an API key with the new account.

Connecting the Frontend and Backend

Once I had completed building the frontend and backend separately, it was time to connect the two. This required a bit of troubleshooting, but it's not too difficult once you know what you're doing.

First, I needed to install the axios package in my frontend to make HTTP requests to the backend. Axios is a promise-based HTTP client that makes it easy to send asynchronous HTTP requests to REST endpoints and perform CRUD operations.

After installing Axios, I created a config file containing the base URL of my backend server, which allowed me to make requests to it from the frontend.
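That config file is only a few lines. A sketch of the idea (the file path, environment variable, and port are placeholders; the baseURL just needs to point at wherever the Express server is running):

// src/api.js (hypothetical path)
import axios from 'axios';

const api = axios.create({
  // Falls back to the local Express server during development
  baseURL: import.meta.env.VITE_API_URL || 'http://localhost:8080/api/v1',
});

export default api;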

Next, I had to create the endpoints in my backend server that would handle the requests from the frontend. I created two endpoints: one to handle requests to generate an image and one to handle requests to save an image to the database.

The dalleRoutes.js file handles image generation:

import express from 'express';
import * as dotenv from 'dotenv';
import { Configuration, OpenAIApi } from 'openai';

dotenv.config();

const router = express.Router();

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});

const openai = new OpenAIApi(configuration);

router.route('/').get((req, res) => {
  res.status(200).json({ message: 'Hello from DALL-E!' });
});

router.route('/').post(async (req, res) => {
  try {
    const { prompt } = req.body;

    const aiResponse = await openai.createImage({
      prompt,
      n: 1,
      size: '1024x1024',
      response_format: 'b64_json',
    });

    const image = aiResponse.data.data[0].b64_json;
    res.status(200).json({ photo: image });
  } catch (error) {
    console.error(error);
    res
      .status(500)
      .send(error?.response?.data?.error?.message || 'Something went wrong');
  }
});

export default router;

The postRoutes.js file handles storing generated images in Cloudinary and posting them to the Community Page.

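What this route needs to do is accept a generated image from the frontend, upload it to Cloudinary, and save the resulting URL to MongoDB so the community feed can read it back. Below is a simplified sketch of that flow, assuming a mongoose Post model with name, prompt, and photo fields and the Cloudinary credentials shown earlier (the model path and response shape are assumptions, not copied from the repo):

import express from 'express';
import * as dotenv from 'dotenv';
import { v2 as cloudinary } from 'cloudinary';

import Post from '../mongodb/models/post.js'; // hypothetical path to the Post model

dotenv.config();

const router = express.Router();

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

// GET all community posts
router.route('/').get(async (req, res) => {
  try {
    const posts = await Post.find({});
    res.status(200).json({ success: true, data: posts });
  } catch (error) {
    res.status(500).json({ success: false, message: error.message });
  }
});

// POST a new image: upload it to Cloudinary, then store the URL in MongoDB
router.route('/').post(async (req, res) => {
  try {
    const { name, prompt, photo } = req.body;

    const photoUrl = await cloudinary.uploader.upload(photo);

    const newPost = await Post.create({
      name,
      prompt,
      photo: photoUrl.url,
    });

    res.status(201).json({ success: true, data: newPost });
  } catch (error) {
    res.status(500).json({ success: false, message: error.message });
  }
});

export default router;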

Testing APIs with Postman

Once I had created the endpoints, I tested them using Postman to make sure that they were working properly. Postman is a popular tool used for testing API endpoints.

With the endpoints working properly, it was time to integrate them into the frontend. I created a form in the frontend that would allow users to input the text prompt for the image they wanted to generate. When the form was submitted, the input was sent to the backend server using Axios.

When the backend server received the request, it used the OpenAI API to generate an image based on the user's input and returned it to the frontend as a base64 string, which the frontend rendered for the user. When the user chose to share that image with the community, the frontend sent it to the post route, where it was uploaded to Cloudinary and its URL saved to MongoDB.
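On the frontend, the generate handler ends up looking roughly like this (the api instance is the hypothetical config file from earlier, the '/dalle' path depends on how the routers are mounted in Express, and the form / setGeneratingImg state names are placeholders for whatever the component actually uses):

const generateImage = async () => {
  if (!form.prompt) return;

  try {
    setGeneratingImg(true);

    // Send the prompt to the Express backend
    const { data } = await api.post('/dalle', { prompt: form.prompt });

    // The dalle route returns the image as a base64 string
    setForm({ ...form, photo: `data:image/jpeg;base64,${data.photo}` });
  } catch (error) {
    alert(error?.response?.data || 'Something went wrong');
  } finally {
    setGeneratingImg(false);
  }
};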

"Surprise Me"

I also built a "Surprise Me" feature that lets a user click "Surprise Me" to get a randomly selected prompt instead of writing their own. Behind it is a simple list of 50 prompt options, one of which is picked with Math.random():

import FileSaver from 'file-saver';

import { surpriseMePrompts } from '../constants';

export function getRandomPrompt(prompt) {
  // Pick a random prompt from the list of 50 options
  const randomIndex = Math.floor(Math.random() * surpriseMePrompts.length);
  const randomPrompt = surpriseMePrompts[randomIndex];

  // Re-roll if it matches the current prompt, so "Surprise Me" always changes something
  if (randomPrompt === prompt) return getRandomPrompt(prompt);

  return randomPrompt;
}

export async function downloadImage(_id, photo) {
  // Save the generated image locally as a PNG named after the post's id
  FileSaver.saveAs(photo, `download-${_id}.png`);
}

Conclusion

Building this DALL-E Image Generator Clone was a great learning experience. It allowed me to work with some of the most popular and cutting-edge technologies in web development and gave me a taste of what it's like to build an AI-powered app.

The potential for AI-powered apps and scalable SaaS companies is immense, and I look forward to seeing what innovative solutions developers will come up with in the years to come.