This topic is a really enjoyable one for me. It’s quite common in many web applications to accept user input and save a single record to your database. But what about when your users (or you) want to perform multiple inserts in a single command?
Enter this article, which will demonstrate how to create a CSV template and a form to upload the CSV file, and how to parse the CSV into a Mongoose Model that will be saved to a MongoDB database.
This article assumes that you have a basic understanding of Mongoose and how it interacts with MongoDB. If you don’t, I would suggest reading my Introduction to Mongoose for MongoDB and Node.js article first. That article describes how Mongoose interacts with MongoDB by creating strongly typed Schemas from which Models are created. If you already have a good understanding of Mongoose, then let’s proceed.
Getting Started
To begin, let’s instantiate a new Node.js application. In a command prompt, navigate to where you want to host your Node.js applications and perform the following commands:
mkdir csvimport
cd csvimport
npm init
I’ve left all the defaults in place, so my application will start with index.js. Before creating and parsing CSV files, some initial setup needs to be done. I want to make this a web application; to do that, I’m going to use the Express package to handle all of the nitty-gritty server setup. In your command prompt, install Express by running the following command:
npm install express --save
Since this web application will accept files via a web form, I’m also going to use the express-fileupload middleware for Express. Let’s install that now as well:
npm install express-fileupload --save
That’s enough initial configuration to set up my web application and create a basic web page containing my file upload form.
Here is my index.js file that sets up my web server:
var app = require('express')();
var fileUpload = require('express-fileupload');
var server = require('http').Server(app);

app.use(fileUpload());
server.listen(80);

app.get('/', function (req, res) {
    res.sendFile(__dirname + '/index.html');
});
This example imports the Express and Express File Upload libraries, configures the web application to use the file upload middleware, and listens on port 80. It also creates a route using Express at “/”, which will be the default landing page for my web application. This route returns an index.html file that contains the web form that will allow a user to upload a CSV file. In my case, I am running on my local computer, so when I visit http://localhost I will see the form that I create in the next example.
My index.html page displays a heading (“Upload Authors”), a short prompt (“Use the form below to upload a list of authors. Click here for an example template.”), and the upload form itself.
This HTML file contains two important things:
- A link to “/template” that, when clicked, will download a CSV template that can be populated with the information to be imported.
- A form with the encType set as multipart/form-data and an input field with a type of file that accepts files with a “csv” extension (a sketch of the markup follows this list).
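Here is a minimal sketch of what such a page might look like. The exact markup can vary; what matters is the “/template” link, the encType, and the input name (file), which the upload handler will read later:

<html>
<body>
    <h1>Upload Authors</h1>
    <p>
        Use the form below to upload a list of authors.
        <a href="/template">Click here</a> for an example template.
    </p>
    <!-- Posts back to "/"; the input name "file" is what upload.js reads. -->
    <form method="post" encType="multipart/form-data">
        <input type="file" name="file" accept=".csv" />
        <input type="submit" value="Upload" />
    </form>
</body>
</html>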
As you may have noticed, the HTML makes reference to an Author template. In my Introduction to Mongoose article, I created an Author Schema. In this article, I am going to recreate that Schema and allow the user to bulk import a collection of authors into my MongoDB database. Let’s take a look at the Author Schema. Before we do that, though, you probably guessed it: we need to install the Mongoose package:
npm install mongoose --save
Creating the Schema and Model
With Mongoose installed, let’s create a new author.js file that will define the Author Schema and Model:
var mongoose = require('mongoose');

var authorSchema = mongoose.Schema({
    _id: mongoose.Schema.Types.ObjectId,
    name: {
        firstName: {
            type: String,
            required: true
        },
        lastName: String
    },
    biography: String,
    twitter: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://twitter.com/') === 0;
                return true;
            },
            message: 'Twitter handle must start with https://twitter.com/'
        }
    },
    facebook: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://www.facebook.com/') === 0;
                return true;
            },
            message: 'Facebook Page must start with https://www.facebook.com/'
        }
    },
    linkedin: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://www.linkedin.com/') === 0;
                return true;
            },
            message: 'LinkedIn must start with https://www.linkedin.com/'
        }
    },
    profilePicture: Buffer,
    created: {
        type: Date,
        default: Date.now
    }
});

var Author = mongoose.model('Author', authorSchema);

module.exports = Author;
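As a quick, hypothetical sanity check (not part of the import flow), a single author could be saved like this, assuming a mongoose.connect(...) call has already been made:

var mongoose = require('mongoose');
var Author = require('./author');

var author = new Author({
    _id: new mongoose.Types.ObjectId(),
    name: { firstName: 'Jamie', lastName: 'Munro' }
});

// save() runs the Schema's validators before writing to MongoDB.
author.save(function (err) {
    if (err) console.error(err);
});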
With the Author Schema and Model created, let’s switch gears and focus on creating the CSV template that can be downloaded by clicking on the template link. To aid with the CSV template generation, I’m going to use the JSON to CSV package. Let’s install that now:
npm install json2csv --save
I’m now going to update my previously created index.js file to include a new route for “/template”:
var template = require('./template.js');
app.get('/template', template.get);
I’ve only included the new code for the template route, which is appended to the previous index.js file.
The first thing this code does is include a new template.js file (to be created next) and create a route for “/template”. This route will call a get function in the template.js file.
With the Express server updated to include the new route, let’s create the new template.js file:
var json2csv = require('json2csv');

exports.get = function(req, res) {
    var fields = [
        'name.firstName',
        'name.lastName',
        'biography',
        'twitter',
        'facebook',
        'linkedin'
    ];

    var csv = json2csv({ data: '', fields: fields });

    res.set("Content-Disposition", "attachment;filename=authors.csv");
    res.set("Content-Type", "application/octet-stream");
    res.send(csv);
};
This file first includes the previously installed json2csv package. I then create and export a get function. This function accepts the request and response objects from the Express server.
Inside the function, I’ve created an array of the fields that I want to include in my CSV template. This can be done in one of two ways. The first way (used in this example) is to create a static list of the fields to be included in the template. The second way is to dynamically create the list of fields by extracting the properties from the Author Schema.
The second way could be done with the following code:
var fields = Object.keys(Author.schema.obj);
I would have liked to use this dynamic method, but it becomes a bit complicated when I don’t want to include some of the Schema’s properties in my CSV template. In this case, my template doesn’t include the _id and created properties, because these will be populated via code. However, if you don’t have fields you wish to exclude, the dynamic method will work as well.
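For completeness, here is a rough sketch of the dynamic approach with exclusions. The excluded list is my assumption, based on the fields the static template omits:

var Author = require('./author');

// Assumed exclusions: fields populated via code or not wanted in the template.
var excluded = ['_id', 'created', 'profilePicture'];

var fields = Object.keys(Author.schema.obj).filter(function (field) {
    return excluded.indexOf(field) === -1;
});
// Caveat: this yields 'name' rather than 'name.firstName'/'name.lastName',
// so nested paths would still require manual flattening.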
Creating the CSV Template
With the array of fields defined, I use the json2csv package to create my CSV template from my JavaScript object. This csv object will be the result of this route. Finally, using the res object from the Express server, I set two header properties that will force the download of an authors.csv file.
At this point, if you were to run your Node application and navigate to http://localhost in your web browser, the web form would be displayed with a link to download the template. Clicking the link downloads the authors.csv file, ready to be populated before it is uploaded.
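If the server isn’t running yet, start it from the project folder (note that listening on port 80 may require elevated privileges on some systems):

node index.js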
Here is an example of a populated CSV file:
name.firstName,name.lastName,biography,twitter,facebook,linkedin
Jamie,Munro,Jamie is a web developer and author,,,
Mike,Wilson,Mike is a web developer and Node.js author,,,
This example, when uploaded, will create two authors: myself and a friend who wrote a book on Node.js a few years back. You may notice that each line ends with three commas (“,,,”). That’s because I’ve left the social network properties (twitter, facebook, and linkedin) empty to keep the example short.
The puzzle pieces are starting to come together and form a picture. Let’s get to the meat and potatoes of this example and parse that CSV file. The index.js file requires some updating to connect to MongoDB and create a new POST route that will accept the file upload:
var app = require('express')();
var fileUpload = require('express-fileupload');
var mongoose = require('mongoose');
var server = require('http').Server(app);

app.use(fileUpload());
server.listen(80);

mongoose.connect('mongodb://localhost/csvimport');

app.get('/', function (req, res) {
    res.sendFile(__dirname + '/index.html');
});

var template = require('./template.js');
app.get('/template', template.get);

var upload = require('./upload.js');
app.post('/', upload.post);
With a database connection and a new POST route configured, it’s time to parse the CSV file. Luckily, there are several great libraries that assist with this job. I have chosen the fast-csv package, which can be installed with the following command:
npm install fast-csv --save
The POST route was created similarly to the template route: it invokes a post function from the upload.js file. It is not necessary to place these functions in separate files; however, I like to create separate files for these routes because it helps keep the code organized.
Submitting Data
And finally, let’s create the upload.js file that contains the post function that is called when the previously created form is submitted:
var csv = require('fast-csv');
var mongoose = require('mongoose');
var Author = require('./author');

exports.post = function (req, res) {
    if (!req.files)
        return res.status(400).send('No files were uploaded.');

    var authorFile = req.files.file;
    var authors = [];

    csv
        .fromString(authorFile.data.toString(), {
            headers: true,
            ignoreEmpty: true
        })
        .on("data", function (data) {
            data['_id'] = new mongoose.Types.ObjectId();
            authors.push(data);
        })
        .on("end", function () {
            // Respond only after the save has finished, so errors are reported.
            Author.create(authors, function (err, documents) {
                if (err) return res.status(500).send(err.message);
                res.send(authors.length + ' authors have been successfully uploaded.');
            });
        });
};
Quite a bit is happening in this file. The first three lines include the necessary packages that will be required to parse and save the CSV data.
Next, the post function is defined and exported for use by the index.js file. Inside this function is where the magic takes place.
The function first checks that there is a file contained in the request body. If there is not, an error is returned indicating that a file must be uploaded.
When a file has been uploaded, a reference to the file is saved in a variable called authorFile. This is done by accessing the files object on the request and its file property. The file property matches the name of the file input that I first defined in the index.html example.
I’ve also created an authors array that will be populated as the CSV file is parsed. This array will be used to save the data to the database.
The fast-csv library is now called by leveraging the fromString function. This function accepts the CSV file as a string. I’ve extracted the string from the authorFile.data property. The data property contains the contents of my uploaded CSV file.
I’ve included two options to the fast-csv function: headers and ignoreEmpty, both set to true. These tell the library that the first line of the CSV file contains the headers and that empty rows should be ignored.
With the options configured, I have set up two listener functions that are called when the data event and the end event are triggered. The data event is emitted once for every row of the CSV file, and it provides a JavaScript object of the parsed row.
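For the first author in the sample CSV above, that parsed object would look roughly like this (the dotted keys come straight from the header row):

{
    'name.firstName': 'Jamie',
    'name.lastName': 'Munro',
    biography: 'Jamie is a web developer and author',
    twitter: '',
    facebook: '',
    linkedin: ''
}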
I update this object to include the _id property of the Author Schema with a new ObjectId. This object is then added to the authors array.
When the CSV file has been fully parsed, the end event is triggered. Inside the event callback function, I call the create function on the Author model, passing the array of authors to it.
If an error occurs while saving the array, an error response is returned; otherwise, a success message is sent to the user indicating how many authors have been uploaded and saved to the database.
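If you’d rather exercise the route without the form, something like the following curl command should work (assuming the server is listening on port 80 and authors.csv is your populated template):

curl -F "file=@authors.csv" http://localhost/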
If you would like to see the full source code, I’ve created a GitHub repository with the code.
Conclusion
In my example, I’ve only uploaded a couple of records. If your use case requires the ability to upload thousands of records, it might be a good idea to save the records in smaller chunks.
This can be done in several ways. If I were to implement it, I would update the data callback function to check the length of the authors array. When the array reaches your defined chunk size, e.g. 100, call the Author.create function on the array, and then reset the array to empty. This will save the records in chunks of 100. Be sure to leave the final create call in the end callback function to save the remaining records. A sketch of this approach follows.
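To make the idea concrete, here is a minimal sketch of what a chunked upload.js might look like. This is one reasonable implementation under the assumptions above, not the article’s original code; BATCH_SIZE and the total counter are illustrative names:

var csv = require('fast-csv');
var mongoose = require('mongoose');
var Author = require('./author');

var BATCH_SIZE = 100; // illustrative chunk size; tune for your data

exports.post = function (req, res) {
    if (!req.files)
        return res.status(400).send('No files were uploaded.');

    var authorFile = req.files.file;
    var authors = [];
    var total = 0;

    csv
        .fromString(authorFile.data.toString(), {
            headers: true,
            ignoreEmpty: true
        })
        .on("data", function (data) {
            data['_id'] = new mongoose.Types.ObjectId();
            authors.push(data);
            total++;

            // Once a full chunk has accumulated, save it and start a new one.
            if (authors.length >= BATCH_SIZE) {
                Author.create(authors, function (err) {
                    if (err) console.error(err);
                });
                authors = [];
            }
        })
        .on("end", function () {
            // Save the final, partial chunk before responding.
            Author.create(authors, function (err) {
                if (err) return res.status(500).send(err.message);
                res.send(total + ' authors have been successfully uploaded.');
            });
        });
};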
Enjoy!