CakeDC Blog

TIPS, INSIGHTS AND THE LATEST FROM THE EXPERTS BEHIND CAKEPHP

File uploading, file storage and CakePHP's MediaView class

This article also covers how to upload and store files, because I've seen a lot of discussion about that too. If you're only interested in how to use the MediaView class, scroll down.

Handling file uploads in CakePHP

First, let's start with the required form. To create a file upload form, all you have to do is this:

echo $form->create('Media', array('action' => 'upload', 'type' => 'file'));
echo $form->file('file');
echo $form->submit(__('Upload', true));

 

The "type" option of Form::create() accepts values such as post, get or file. To configure the form for file uploads, set it to file, which renders the form as a multipart/form-data form.

When you submit the form now, you'll get data like this in $this->data of your controller:

Array
(
	[Media] => Array
	(
		[file] => Array
		(
			[name] => cake.jpg
			[type] => image/jpeg
			[tmp_name] => /tmp/hp1083.tmp
			[error] => 0
			[size] => 24530
		)
	)
)

OK, now the big question with a simple answer: where should the file data be processed? In the model, because it is data to deal with and there is validation to do against it. Because uploading files is a recurring task, I suggest you write a behavior for it or convert your existing component to a behavior.

If you keep it generic, you can extend it with a CsvUpload, VideoUpload or ImageUpload behavior that processes the file directly after its upload or does special things with it, like resizing the image or parsing the CSV file and storing its data in an (associated) model.

We're not going to show you our own code here for obvious reasons, but I'll give you a few hints about what you can or should do inside the behavior:

  1. Validate the uploaded field. The field itself already contains an error code if something went wrong with the upload. The PHP manual lists the error codes you can get from the form data: http://www.php.net/manual/en/features.file-upload.errors.php
  2. Validate the uploaded file: is it really the kind of file you want, and does it really contain the data structure you expect?
  3. Check whether the target destination of the file is writable, create directories or whatever else is needed, and handle errors for it. I suggest you use CakePHP's File and Folder classes for that.
  4. Add callbacks like beforeFileSave() and afterFileSave() so that extending behaviors can use them.
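
The first three hints can be sketched in plain PHP. This is only a minimal illustration, not our behavior code: the function names uploadErrorMessage(), looksLikeImage() and ensureWritableDir() are made up for this example, and it uses PHP's built-in functions instead of CakePHP's File and Folder classes so that it runs on its own.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in CakePHP.

/**
 * Hint 1: turn the "error" key of an upload array into a message.
 * Returns null when the upload itself succeeded.
 */
function uploadErrorMessage($file) {
	$messages = array(
		UPLOAD_ERR_INI_SIZE   => 'File exceeds upload_max_filesize in php.ini',
		UPLOAD_ERR_FORM_SIZE  => 'File exceeds MAX_FILE_SIZE given in the form',
		UPLOAD_ERR_PARTIAL    => 'File was only partially uploaded',
		UPLOAD_ERR_NO_FILE    => 'No file was uploaded',
		UPLOAD_ERR_NO_TMP_DIR => 'Missing temporary folder',
		UPLOAD_ERR_CANT_WRITE => 'Failed to write file to disk',
		UPLOAD_ERR_EXTENSION  => 'A PHP extension stopped the upload',
	);
	if (!is_array($file) || !isset($file['error'])) {
		return 'Malformed upload data';
	}
	$code = (int)$file['error'];
	if ($code === UPLOAD_ERR_OK) {
		return null;
	}
	return isset($messages[$code]) ? $messages[$code] : 'Unknown upload error';
}

/**
 * Hint 2 (for images only): getimagesize() inspects the file header
 * and returns false when the file is not an image it recognizes.
 */
function looksLikeImage($path) {
	return is_file($path) && @getimagesize($path) !== false;
}

/**
 * Hint 3: create the target directory (recursively) if needed
 * and report whether it can be written to.
 */
function ensureWritableDir($dir) {
	if (!is_dir($dir) && !@mkdir($dir, 0755, true)) {
		return false;
	}
	return is_writable($dir);
}
```

Inside a behavior you would call something like these from a validation rule or from beforeSave() and add the returned message to the model's validation errors. For hint 1 you may also want to check is_uploaded_file() on tmp_name before moving the file.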

Database vs file system storage

Feel free to skip this part if you already store your files in the file system.

Storing files in the database is a bad solution in nearly all cases, because every time you fetch the file it has to travel through the database connection. This can cause performance problems, especially when the database server is not in the same network.

Advantages of storage in the file system:

  1. Easy and direct file access, to parse them (csv, xml...) or manipulate them (images)
  2. You don't need to install any additional software to manage them
  3. Easy to move and mount on other machines
  4. Smaller than when stored in a DB

The suggested solution is to store meta data of the file like size, hash, maybe path and other related info in a DB table and save the file in the file system.
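
As a rough sketch of what such a meta data record could look like, the following plain PHP function collects a few values for a stored file. The function name buildFileMeta() and the array keys are assumptions made for this example; map them to whatever columns your own table uses.

```php
<?php
// Hypothetical helper for illustration; the key names are not prescribed by CakePHP.

/**
 * Collects meta data about a file that is already stored on disk.
 * Returns false if the file does not exist.
 */
function buildFileMeta($path) {
	if (!is_file($path)) {
		return false;
	}
	$info = pathinfo($path);
	return array(
		// File name without its extension
		'filename'  => $info['filename'],
		// Lowercased extension, or an empty string if there is none
		'extension' => isset($info['extension']) ? strtolower($info['extension']) : '',
		// Size in bytes
		'size'      => filesize($path),
		// SHA-1 checksum of the content, useful for detecting duplicates
		'hash'      => sha1_file($path),
	);
}
```

The returned array could then be saved through the model, while the file itself stays in the file system.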

Some people bring up security and want to store files in the database for that reason, which is wrong. You should not store the file in a publicly accessible directory like the webroot of the application. Store it in another location like APP/media. You control access to the file by checking permissions against the DB records of your meta data and sending it with the CakePHP MediaView class; I'll explain later how to use it.

I'm not saying that storing files in the DB is a bad idea in general, but for web-based applications it is a bad idea in nearly every case.

File system Performance

A bottleneck in the long run on every file system is a large number of files in a single directory. Imagine just 10,000 users, each with an individual avatar image. Furthermore, ext3 for example is limited to roughly 32,000 subdirectories per directory, and other file systems may have similar restrictions. You can find a list of file system limitations here: http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits

To avoid performance problems caused by that, you should store your files in a pseudo-random directory structure like APP/media/32/a5/3n/. This will also allow you to easily mount some of these semi-randomly created directories on another machine in case you run out of disk space.

/**
 * Builds a semi random path based on the id to avoid having thousands of files
 * or directories in one directory. This would result in a slowdown on most file systems.
 *
 * Works up to 5 level deep
 *
 * @see http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
 * @param mixed $string
 * @param integer $level
 * @return mixed
 * @access protected
 */
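	protected function _randomPath($string, $level = 3) {
		// Reject empty input, there is nothing to build a path from
		if (!$string) {
			throw new Exception(__('First argument must not be empty!', true));
		}

		// Numeric checksum of the id; its digits become the directory names
		$string = crc32($string);
		$decrement = 0;
		$path = '';

		// Take two digits per level, moving from the end of the checksum to the front
		for ($i = 0; $i < $level; $i++) {
			$decrement = $decrement - 2;
			$path .= sprintf("%02d" . DS, substr('000000' . $string, $decrement, 2));
		}

		return $path;
	}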

You should also know that PHP running in safe mode does not allow you to create more than one directory level in one call. You have to take this into consideration; the above function does not cover it, because safe mode was deprecated in PHP 5.3 and removed in PHP 5.4.
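
To try the idea outside of a CakePHP class, here is a standalone variant of the method above. It is a sketch under a few assumptions of mine: the function name randomPath(), the $separator argument and the explicit level check are not in the original, and it formats the checksum with %u so that it stays an unsigned number on 32-bit systems as well.

```php
<?php
// Standalone sketch of the same technique; names and argument order are illustrative.

/**
 * Builds a directory path of $level two-digit segments from the
 * CRC32 checksum of $string, for example for a record id.
 */
function randomPath($string, $level = 3, $separator = '/') {
	if ($string === '' || $string === null) {
		throw new InvalidArgumentException('First argument must not be empty');
	}
	if ($level < 1 || $level > 5) {
		throw new InvalidArgumentException('Level must be between 1 and 5');
	}

	// %u forces an unsigned decimal, so the checksum has at most 10 digits;
	// pad it so that every level has two digits to take
	$checksum = str_pad(sprintf('%u', crc32($string)), 10, '0', STR_PAD_LEFT);

	$path = '';
	for ($i = 1; $i <= $level; $i++) {
		// Two digits per level, read from the end of the checksum
		$path .= substr($checksum, -2 * $i, 2) . $separator;
	}
	return $path;
}

// Build the storage directory for a record id (id and base directory are made up)
$directory = '/var/app/media/' . randomPath('4a7d1ed4-14d0-4b5c-9e2f-0c1d2e3f4a5b');
```

The same id always yields the same path, so you can recompute the location of a file from its id instead of storing the path, or store it once in the meta data table.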

Sending a file to the client – or the unknown MediaView class

From what I've seen in the ruins of outsourced projects that asked us for rescue, and also in the CakePHP Google group, I think not many people are aware that CakePHP has a view intended for downloading and displaying files (images, text...). It's called the MediaView class.

I'll now explain how to use this class to send files to the client.

/**
 * Sends a file to the client
 *
 * @param string $id UUID
 * @access public
 */
	public function download($id = null) {
		$this->Media->recursive = -1;
		$media = $this->Media->read(null, $id);

		if (empty($media)) {
			$this->redirect('/', 404, true);
		}
		
		$this->set('cache', '3 days');
		$this->set('download', true);
		$this->set('name', $media['Media']['slug']);
		$this->set('id', $media['Media']['filename']);
		$this->set('path', APP . 'media' . DS . $media['Media']['path']);
		$this->set('modified', $media['Media']['modified']);
		$this->set('mimeType', $media['Media']['mime_type']);
		$this->set('extension', $media['Media']['extension']);

		$this->view = 'Media';
		$this->autoLayout = false;
		if ($this->render() !== false) {
			$this->Media->updateAll(
				array('Media.downloads' => 'Media.downloads + 1'),
				array('Media.id' => $id));
		}
	}

You simply have to set autoLayout to false and the view class to Media.

$this->view = 'Media';
$this->autoLayout = false;

There are a few view variables to set to “configure” the file download or display. To control whether the client downloads the file or displays it, in the case of images for example, you simply set 'download' to true or false:

	$this->set('download', true);

You can control the browser caching of the file by setting cache. Please note that you do not have to use caching if download is set to true! Downloads do not need caching.

	$this->set('cache', '3 days');

The next part might be a little confusing: you have “id” and “name”. “id” is the actual file on your server that you want to send, while “name” is the filename under which the file is sent to the client. “path” is the path to the file on the server.

	$this->set('name', $media['Media']['slug']);
	$this->set('id', $media['Media']['filename']);
	$this->set('path', APP . 'media' . DS . $media['Media']['path']);

If you want to send a mime type that is not already known to the MediaView class, you can set it.

	$this->set('mimeType', $media['Media']['mime_type']);

If you don't set it, the class will try to determine the mime type by the extension.

	$this->set('extension', $media['Media']['extension']);

Note that you have to set the extension to make it work and that the extension is attached to the filename! If you store the filename with an extension you have to break it up.
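
Breaking up a stored filename can be done with PHP's pathinfo(). The helper below is a small sketch; the name splitFilename() is made up for this example and is not part of CakePHP.

```php
<?php
// Hypothetical helper for illustration.

/**
 * Splits a filename into the part before the last dot and the extension.
 */
function splitFilename($filename) {
	$info = pathinfo($filename);
	return array(
		'name'      => $info['filename'],
		// Files without a dot have no extension key in pathinfo()'s result
		'extension' => isset($info['extension']) ? strtolower($info['extension']) : '',
	);
}

$parts = splitFilename('cake.jpg');
// $parts['name'] is 'cake' and $parts['extension'] is 'jpg'
```

In the controller you would then pass $parts['name'] to the 'name' view variable and $parts['extension'] to 'extension'.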

When everything is set, you can check whether render() was successful and do whatever you want after that, for example count the download.

	if ($this->render() !== false) {
		$this->Media->updateAll(
			array('Media.downloads' => 'Media.downloads + 1'),
			array('Media.id' => $id));
	}

 

Closing words

I hope you enjoyed reading the article and that it helped you improve your knowledge of CakePHP. Feel free to ask further questions in the comments. Have fun coding!

Latest articles

Why Database Compression?

Nowadays people are not concerned about how large their database is in terms of MB. Storage is cheap. Even getting cheap SSD storage is not a big deal.

However, this is only true if we are talking about hundreds of MB or even several GB. Sometimes we get into a situation where we have massive amounts of data (e.g. several tables with lots of longtext columns). At this point it becomes a concern because we need to increase the hard disk size, and we find ourselves checking whether the hard disk is full several times per day or week.

Now, if you have faced a situation like this before, it's time to talk about database compression. Compression is a technique developed theoretically back in the 1940s but actually implemented in the 1970s. For this post we will focus on MySQL compression, which is performed using the open-source zlib library. This library implements the LZ77 dictionary-based compression algorithm.

Before going into MySQL compression details, let's name some of the main DBMSs and their compression techniques:

  • MySQL: ZLib (LZ77) [1]
  • Oracle: Oracle Advanced Compression (Proprietary)[2]
  • Postgres: PGLZ or LZ4 (if added this option at compilation level) [3]
  • DB2: Fixed-length compression or Huffman in some systems [4]
So, now that we know this useless information, let's learn how to implement this in MySQL.

Firstly, you need to know that you CAN'T enable compression if:

  • Your table lives in the `system` tablespace, or
  • Your tablespace was created with the option `innodb_file_per_table` disabled.

It is important to test whether compression is the best solution for you. If you have a table with a lot of small columns, you will probably end up with a larger table after "compressing" because of the headers and compression information. Compression works best when you have longtext columns which can be heavily compressed.

Then, to enable compression for a table, you just need to include the following option when your table is created, or execute it as part of an alter statement: ROW_FORMAT=COMPRESSED

These are the basics, but you may find more useful information in the MySQL manual.

You can also take a look at Percona, which implements column-level compression. This is interesting if you have a table with a lot of small fields and one large column, or if you have to optimize your database as much as possible. [6]

Finally, even though storage is cheaper than ever, the amount of information has increased as well and we are now using and processing an incredible amount of data, so it looks like compression will always be a requirement.

I hope you find this information useful. Please let me know if you have any questions or suggestions in the comments section below.

  [1]: https://dev.mysql.com/doc/internals/en/zlib-directory.html
  [2]: https://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
  [3]: https://www.postgresql.org/docs/devel/runtime-config-client.html
  [4]: https://www.ibm.com/docs/en/db2-for-zos/12?topic=performance-compressing-your-data
  [6]: https://www.percona.com/doc/percona-server/8.0/flexibility/compressed_columns.html

Migrate CsrfComponent to CsrfProtectionMiddleware

The CsrfComponent has been deprecated since CakePHP version 3.5.0. In CakePHP 4, we now have a new middleware to help us protect applications against Cross Site Request Forgery attacks. In this article, we are going to show the different ways to enable and disable Cross Site Request Forgery protection with the component and with the new middleware.

Enable CSRF

Make these changes:
  • In your Application::middleware add $middlewareQueue->add(new CsrfProtectionMiddleware());
  • Remove $this->loadComponent('Csrf') from your controllers.
The CsrfComponent configuration keys cookieName, expiry, secure and field are also available in the middleware. If you used any of these, you should be able to continue using them with the middleware.

Disable CSRF

It is not recommended to disable CSRF, but sometimes you really need to. With the component you could have something like this in your controller:

Now with the middleware, we can use the method skipCheckCallback to disable CSRF based on custom logic:

That’s it, we have migrated CSRF protection from CsrfComponent to CsrfProtectionMiddleware.

CakePHP Upgrade to 4 - Piece by Piece

Let's imagine you have a huge application in CakePHP 2.x (or 1.x) and you're planning to upgrade to the latest CakePHP 4.x. After doing some estimations, you realize the upgrade process is out of your scope, because you don't have the budget or developer availability to do it in one shot. At this point, some companies would abort the upgrade and keep working on 2.x for "some more time" until "this last release is delivered" or until "budget is available next fall", digging deeper and deeper into the rabbit hole…

There's an alternative you could follow if this is your case: proceed with the upgrade of a smaller portion of your application and let the 2 versions coexist for some time.

Warning: This is NOT for every project or company. Please think carefully about this decision, as it has overhead you'll need to handle.

So, if your application has a portion that could be extracted, with a small set of dependencies on other areas of your application, or if you are creating a new feature with a limited set of dependencies on the rest of your application, this approach would be good for you.

In order to allow both applications to coexist, we are going to keep the CakePHP 2.x application as the main one, and use CakePHP 4.x as a subfolder inside of the first one. It's important to note that in order to share sessions between both applications you'll need to use a storage you can actually share, like database or cache based sessions (redis, etc). Then, you can use a configuration like this one (see below) to add a new upstream to handle your new application. Note: the upstream could be located on another server in your network, using a different PHP version etc.

We've used nginx as an example, but you can use the same approach in other web servers like Apache.

In our example, all paths starting with /api are going to be managed by our new CakePHP 4.x application.

upstream cake4 {
    # Note this could be any server/port in your network where the cake4 application is installed
    server 127.0.0.1:9090;
}

# This is our CakePHP 2.x server
server {
    server_name example.com;

    root   /var/virtual/example.com/app/webroot;
    index index.php;

    # All requests /api are forwarded to our CakePHP 4.x application
    location /api {
        proxy_pass http://cake4;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
    }

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    location ~ \.php$ {
        try_files $uri =404;
        include fastcgi_params;
        fastcgi_pass unix:/run/php/php7.4-fpm.sock;
        fastcgi_index index.php;
        fastcgi_intercept_errors on;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}

# This is our CakePHP 4.x server
server {
    listen 9090;
    server_name example.com;

    root   /var/virtual/cake4-example.com/webroot;
    index index.php;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    location ~ \.php$ {
        try_files $uri =404;
        include fastcgi_params;
        fastcgi_pass unix:/run/php/php7.4-fpm.sock;
        fastcgi_index index.php;
        fastcgi_intercept_errors on;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}

As you can see, we have 3 blocks defined in our configuration file:

  • upstream cake4 {...} to forward requests to the CakePHP 4.x application
  • server {... 2.x ...} using a location /api to forward all these calls to the CakePHP 4.x server
  • server {... 4.x ...} using a specific port (9090) to handle requests in CakePHP 4.x
Using this approach, you can break your application into 2 parts, and start moving features by path to CakePHP 4. You'll need to handle the changes in 2 projects for a while, and pay this overhead, but this could be better to maintain than a CakePHP 2.x application sitting on an old PHP version.

Happy baking!
