Uploading Files To MongoDB Without Express
Building functionality to upload a file to a Node.js server using
Express is a piece of cake. But for various reasons we sometimes do not
want to use Express. I had to implement file upload for a system that
uses only pure Node.js. Here is what I learned along the way.
HTTP multipart request
An HTTP request has text headers, but its body can carry any bytes. The
catch is the default encoding of an HTML form,
application/x-www-form-urlencoded, which is designed for short text
values. Every byte outside a small set of safe characters has to be
percent-encoded as three characters, so a binary file could grow to up
to three times its size. There is also no place in that encoding to put
a file name or a per-file content type.
To avoid such pitfalls the multipart request format is used. The body
of an HTTP multipart request is formatted a little differently from its
regular counterpart. Most notably, the value of the Content-Type header
is 'multipart/form-data', together with a boundary parameter. The body
can contain multiple fields and files, each in its own part, separated
by that boundary. Each part starts with a few headers of its own, and
the data after them is carried as raw bytes. The receiver splits the
body at the boundaries and does not care what the bytes in between mean.
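To make the format concrete, here is a rough sketch that assembles a multipart body by hand. It is for illustration only: buildMultipartBody, the boundary string and the sample field and file names are hypothetical, and a real user agent also escapes special characters in names and picks a boundary that cannot occur in the data.

```javascript
// Sketch: assemble a multipart/form-data body by hand to show its layout.
// buildMultipartBody and its parameters are hypothetical names for illustration.
function buildMultipartBody(boundary, fields, file) {
  var chunks = [];
  // Each text field becomes its own part with a Content-Disposition header.
  Object.keys(fields).forEach(function (name) {
    chunks.push(Buffer.from(
      '--' + boundary + '\r\n' +
      'Content-Disposition: form-data; name="' + name + '"\r\n\r\n' +
      fields[name] + '\r\n'
    ));
  });
  // The file part carries a filename and its own Content-Type,
  // followed by the raw bytes of the file.
  chunks.push(Buffer.from(
    '--' + boundary + '\r\n' +
    'Content-Disposition: form-data; name="' + file.fieldName +
    '"; filename="' + file.filename + '"\r\n' +
    'Content-Type: ' + file.contentType + '\r\n\r\n'
  ));
  chunks.push(file.content);
  // The closing boundary has two extra dashes at the end.
  chunks.push(Buffer.from('\r\n--' + boundary + '--\r\n'));
  return Buffer.concat(chunks);
}

var body = buildMultipartBody(
  'example-boundary',
  { title: 'My picture' },
  {
    fieldName: 'upload',
    filename: 'sample.bin',
    contentType: 'application/octet-stream',
    content: Buffer.from([0x00, 0x04, 0xff]) // arbitrary bytes, sent unencoded
  }
);

// The request would carry the header:
//   Content-Type: multipart/form-data; boundary=example-boundary
console.log(body.toString('latin1'));
```

Printing the body shows one part per field or file, each introduced by the boundary line, with the file bytes sitting in the body untouched.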
So when we upload a file to a server over the internet, what we
actually do is no different from submitting a form with an HTTP POST
request, except that the request body is encoded in a different way.
However, the application programmer does not need to know the above,
because the user agent she is writing the program for should know how
to put together an HTTP multipart request. For example, the browser (a
user agent) submits a multipart request when the following HTML form is
submitted.
<form action="/upload" enctype="multipart/form-data" method="post">
<input type="text" name="title"><br>
<input type="file" name="upload" multiple="multiple"><br>
<input type="submit" value="Upload">
</form>
Or on the Linux terminal
curl -v --include --form file=@my_image.png http://localhost:8080/upload
Server side
Just as the HTTP client encodes a multipart request for the application
programmer, the server-side framework should decode one for her. As
mentioned earlier, this is easy with Express. But if Express is not an
option and you are on pure Node.js, you might be a little confused. I
was too, until I got to know about multiparty. This npm package takes
the request instance and, just as Express would, gives you references
to the files included in the request, saved on disk in the temp
directory.
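var http = require('http');
var multiparty = require('multiparty');

http.createServer(function(req, res) {
  if (req.url === '/upload' && req.method === 'POST') {
    // parse a file upload
    var form = new multiparty.Form();
    form.parse(req, function(err, fields, files) {
      if (err) {
        res.writeHead(400, {'content-type': 'text/plain'});
        res.end('Invalid upload: ' + err.message);
        return;
      }
      // 'files' maps each field name to an array of the files saved on disk
      res.writeHead(200, {'content-type': 'text/plain'});
      res.end('File uploaded successfully!');
    });
    return;
  }
  // anything else is not handled by this example
  res.writeHead(404);
  res.end();
}).listen(8080);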
In the callback of the form.parse method it is possible to read the file
in and save it to a database, rename it (move it) or do any other
processing.
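As one example of such processing, the sketch below moves the uploaded temp files into a directory of your choice. moveUploadedFiles and destDir are hypothetical names, and the code assumes the 'files' argument has the shape multiparty passes to the callback: an object that maps each field name to an array of entries with 'originalFilename' and 'path' properties. It uses synchronous fs calls only to stay short.

```javascript
var fs = require('fs');
var path = require('path');

// Sketch: move every uploaded temp file into destDir.
// moveUploadedFiles is a hypothetical helper; it assumes `files` looks like
//   { fieldName: [ { originalFilename: '...', path: '/tmp/...' }, ... ] }
function moveUploadedFiles(files, destDir) {
  var moved = [];
  fs.mkdirSync(destDir, { recursive: true });
  Object.keys(files).forEach(function (fieldName) {
    files[fieldName].forEach(function (file) {
      // basename() drops any directory components sent by the client.
      var target = path.join(destDir, path.basename(file.originalFilename));
      // Copy then delete instead of rename, since the temp directory
      // may be on a different filesystem than destDir.
      fs.copyFileSync(file.path, target);
      fs.unlinkSync(file.path);
      moved.push(target);
    });
  });
  return moved;
}
```

Inside the form.parse callback it could be called as moveUploadedFiles(files, './uploads'). Note that two uploads with the same original name would overwrite each other in this sketch.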
Processing the request
But if we are going to save the file in the MongoDB database, why save
it on the disk? Turns out we don't have to.
The form instance created by multiparty's Form constructor has 'part'
and 'close' events to which handlers can be hooked. The 'part' event is
triggered once for each part (file or field) included in the multipart
request. 'close' is triggered once all the parts have been read.
The handler of the 'part' event is passed a Node.js Readable stream,
just like the request instance of a Node.js HTTP server. So it has
'data' and 'end' events (among others) that can be used to read in the
file, chunk by chunk.
form.on('part', function(part) {
  console.log('got file named ' + part.filename);
  var chunks = [];
  // without setEncoding() each chunk arrives as a binary Buffer
  part.on('data', function(d){ chunks.push(d); });
  part.on('end', function(){
    var data = Buffer.concat(chunks);
    // data has the file now. It can be saved in the MongoDB database.
  });
});
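A possible variation is to wrap the chunk-by-chunk read in a small helper that returns a promise for the whole file as a Buffer. collectStream is a hypothetical name, not part of multiparty, and the sketch assumes setEncoding() has not been called on the stream, so that every chunk is a Buffer.

```javascript
var stream = require('stream');

// Sketch: read a Readable stream chunk by chunk into a single Buffer.
// collectStream is a hypothetical helper name, not part of multiparty.
function collectStream(readable) {
  return new Promise(function (resolve, reject) {
    var chunks = [];
    readable.on('data', function (chunk) { chunks.push(chunk); });
    readable.on('end', function () { resolve(Buffer.concat(chunks)); });
    readable.on('error', reject);
  });
}

// Usage with a stand-in stream; in the upload handler the part would be passed instead.
var fake = stream.Readable.from([
  Buffer.from([0x89, 0x50]),
  Buffer.from([0x4e, 0x47])
]);
collectStream(fake).then(function (buf) {
  console.log('read ' + buf.length + ' bytes');
});
```

Keep in mind that this holds the whole file in memory, so it only suits uploads of modest size.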
The handler of the 'close' event can be used to respond to the client.
form.on('close', function() {
  res.writeHead(200, {'content-type': 'text/plain'});
  res.end("File uploaded successfully!");
});
The complete code would look like this.
var http = require('http');
var multiparty = require('multiparty');

http.createServer(function(request, response) {
  var form = new multiparty.Form();

  form.on('part', function(part) {
    if (!part.filename) { // not a file but a field
      console.log('got field named ' + part.name);
      part.resume(); // discard the field's data
      return;
    }
    console.log('got file named ' + part.filename);
    var bufs = [];
    // without setEncoding() each chunk arrives as a binary Buffer
    part.on('data', function(d){ bufs.push(d); });
    part.on('end', function(){
      var data = Buffer.concat(bufs);
      // data has the file now. It can be saved in the MongoDB database.
    });
  });

  form.on('error', function(err) {
    response.writeHead(400);
    response.end('Invalid upload: ' + err.message);
  });

  form.on('close', function() {
    response.writeHead(200);
    response.end("File uploaded successfully!");
  });

  form.parse(request);
}).listen(8080);
Multiparty saves the files to disk only if the form.parse method is
given a callback (or a 'file' listener is attached to the form). So in
the above case it does not do so. The file is expected to be processed
in the event handlers of the form instance.
Saving to MongoDB
Saving the data in the MongoDB database can be done using GridStore
(GridFSBucket in current versions of the MongoDB Node.js driver). This
part is not included in this post since it is straightforward. Further,
this step is the same whether we use Express or not, and I want this
post to be specific to the case of pure Node.js.
Thanks for checking it out!