CakeDC Blog

TIPS, INSIGHTS AND THE LATEST FROM THE EXPERTS BEHIND CAKEPHP

File uploading, file storage and CakePHPs MediaView class

This article includes how to upload and store files, because I've seen a lot of discussion about that too, but if you're just interested in how to use the MediaView class scroll down.

Handling file uploads in CakePHP

First let's start with the required form, to create a file upload form all you have to do is this:

echo $form->create('Media', array('action' => 'upload', 'type' => 'file'));
echo $form->file('file');
echo $form->submit(__('Upload', true));

 

The "type" in the options of Form::create() takes post, get or file. To configure the form for file uploading it has to be set to file which will render the form as a multipart/form-data form.

When you submit the form now, you'll get data like this in $this->data of your controller:

Array
(
	[Media] => Array
	(
		[file] => Array
		(
			[name] => cake.jpg
			[type] => image/jpeg
			[tmp_name] => /tmp/hp1083.tmp
			[error] => 0
			[size] => 24530
		)
	)
)

Ok, now the big question with a simple answer is where the file data should be processed, guess where. Right – in the model because it's data to deal with and validation to do against it. Because it's a recurring task to upload files I suggest you to write a behaviour for it or convert your existing component to a behaviour.

If you keep it generic you can extend it with a CsvUpload, VideoUpload or ImageUpload behaviour to process the file directly after its upload or do special stuff with it, like resizing the image or parsing the csv file and store its data in a (associated) model.

We're not going to show you our own code here for obvious reasons, but I'll give you a few hints what you can or should do inside of the behavior:

  1. Validate the uploaded field, the field itself contains already an error code if something was wrong with the upload. Here is a link to the php manual page that shows you the list of the errors that you can get from the form data. http://www.php.net/manual/en/features.file-upload.errors.php
  2. Validate the uploaded file, is it really the kind of file you want and does it really contain the data structure you want?
  3. Check if the target destination of the file is writeable, create directories, whatever is needed and error handling for it, I suggest you to use CakePHP's File and Folder classes for that.
  4. Add a callback like beforeFileSave() and afterFileSave() to allow possible extending behaviors to use them.

Database vs file system storage

Feel free to skip that part if you already store the files in the file system.

Storing files in the database is in nearly all cases a bad solution because when you get the file it has to go its way through the database connection, which can, specially on servers that are not in the same network, cause performance problems.

Advantages of storage in the file system:

  1. Easy and direct file access, to parse them (csv, xml...) or manipulate them (images)
  2. You don't need to install any additional software to manage them
  3. Easy to move and mount on other machines
  4. Smaller then stored in a DB

The suggested solution is to store meta data of the file like size, hash, maybe path and other related info in a DB table and save the file in the file system.

Some people come up with the security and want to store a file because of that in the database which is wrong. You should not store the file in a public accessible directory like the webroot of the application. Store it in another location like APP/media. You control the access to the file by checking the permissions against the DB records of your meta data and sending it by using the CakePHP MediaView class, I'll explain later how to use it.

I don't say that storage of files inside the DB is in general a bad idea but for web based applications it is in nearly every case a bad idea.

File system Performance

A bottleneck in the long run on every file system is a large amount of files in a single directory. Imagine just 10.000 users and each has an individual avatar image. Further ext3 for example is limited to 32000 sub folders, other file systems have maybe similar restrictions. You can find a list of file system limitations here: http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits

To avoid performance problems caused by that you should store your files in a pseudo-random directory structure like APP/media/32/a5/3n/. This will also allow you to easily mount some of the semi-random created directories on another machine in the case you run out of disk space.

/**
 * Builds a semi random path based on the id to avoid having thousands of files
 * or directories in one directory. This would result in a slowdown on most file systems.
 *
 * Works up to 5 level deep
 *
 * @see http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
 * @param mixed $string
 * @param integer $level
 * @return mixed
 * @access protected
 */
	protected function _randomPath($string, $level = 3) {
		if (!$string) {
			throw new Exception(__('First argument is not a string!', true));
		}

		$string = crc32($string);
		$decrement = 0;
		$path = null;
		
		for ($i = 0; $i < $level; $i++) {
			$decrement = $decrement -2;
			$path .= sprintf("%02d" . DS, substr('000000' . $string, $decrement, 2));
		}

		return $path;
	}

You should also know that php running in safe mode does not allow you to create more then one directory deep in one call. You have to take this in consideration, the above function does not cover that because safe mode is basically deprecated and will be also removed in php6

Sending a file to the client – or the unknown MediaView class

From what I've seen in the ruins of outsourced projects that asked us for rescue and also in the CakePHP googlegroup I think not many people are aware that CakePHP has a view that is thought to be used for downloads and display (images, text...) of files. It's called the MediaView class.

I'll now explain you how to use this class to send files to the client.

/**
 * Sends a file to the client
 *
 * @param string $id UUID
 * @access public
 */
	public function download($id = null) {
		$this->Media->recursive = -1;
		$media = $this->Media->read(null, $id);

		if (empty($media)) {
		$this->redirect('/', 404, true);
		}
		
		$this->set('cache', '3 days');
		$this->set('download', true);
		$this->set('name', $media['Media']['slug']);
		$this->set('id', $media['Media']['filename']);
		$this->set('path', APP . 'media' . DS . $media['Media']['path']);
		$this->set('modified', $media['Media']['modified']);
		$this->set('mimeType', $media['Media']['mime_type']);
		$this->set('extension', $media['Media']['extension']);

		$this->view = 'Media';
		$this->autoLayout = false;
		if ($this->render() !== false) {
			$this->Media->updateAll(
				array('Media.downloads' => 'Media.downloads + 1'),
				array('Media.id' => $id));
		}
	}

You simply have to set autoLayout to false and the view class to media.

$this->view = 'Media';
$this->autoLayout = false;

There are a few view variables to set to “configure” the file download or display. To control if you want to make the client downloading the file or to display it, in the case of images for example, you simply set 'download' to true or false;

	$this->set('download', true);

You can control the browser caching of the file by setting cache. Please not that you do not have to use caching if download is set to true! Downloads do not need caching.

	$this->set('cache', '3 days');

The next part might be a little confusing, you have “id” and “name”. Id is the actual file on your server you want to send while name is the filename under which you want to send the file to the client. “path” is the path to the file on the server.

	$this->set('name', $media['Media']['slug']);
$this->set('id', $media['Media']['filename']);
$this->set('path', APP . 'media' . DS . $media['Media']['path']);

If you want to send a mime type that does not already in the MediaView class you can set it.

	$this->set('mimeType', $media['Media']['mime_type']);

If you don't set it, the class will try to determine the mime type by the extension.

	$this->set('extension', $media['Media']['extension']);

Note that you have to set the extension to make it work and that the extension is attached to the filename! If you store the filename with an extension you have to break it up.

When everything is set you can check if render() was successfully and do whatever you want after that, for example count the download.

	if ($this->render() !== false) {
	$this->Media->updateAll(
	array('Media.downloads' => 'Media.downloads + 1'),
	array('Media.id' => $id));
}

 

Closing words

I hope you enjoyed reading the article and it helped you improving your knowledge about CakePHP. Feel free to ask further questions by using the comment functionality. Have fun coding!

Latest articles

Using a vagrant box as quick environment for the Getting Started with...

We've decided to create a simple vagrant box with all the required packages to improve the environment setup step in our free Getting Started with CakePHP training session. We used other tools in the past, but we hope vagrant will help users to install a common environment before the session to get the most of it.

Requirements

Setup

  • Create a new folder where the code will be located
  • Create a new file called Vagrantfile with the following contents
# -*- mode: ruby -*- # vi: set ft=ruby : Vagrant.configure("2") do |config| config.vm.box = "cakedc/cakephp-training" config.vm.network :forwarded_port, guest: 8765, host: 8765 config.vm.network :private_network, ip: "192.168.33.33" config.vm.provider "virtualbox" do |vb| vb.memory = "1024" vb.customize ['modifyvm', :id, '--cableconnected1', 'on'] end end
  • Run vagrant up
  • Wait (download could take several minutes depending on your internet connection)
  • Run vagrant ssh
Now you have ssh access to a training ubuntu (16.04) based virtual machine, with all the requirements to run your training CakePHP application.
  • Setup a new CakePHP project
cd /vagrant composer create-project cakephp/app
  • Start the local server
cd /vagrant/app php bin/cake.php server --host 0.0.0.0
  • From your host machine, open a browser and navigate to http://localhost:8765
  • You should be able to see the CakePHP welcome page
  We think this VM will enable faster environment setups, and an easier entry point to the training session. Please let us know if you find issues with this process.

Boosting your API with CakePHP API and PHP-PM (ReactPHP)

A couple days ago AlexMax commented in CakePHP's IRC channel about the https://github.com/php-pm/php-pm project and it rang a bell for us. We did a couple tests internally and found this could be a great companion to our API plugin, so we wrote a new Bridge for CakePHP and ran some benchmarks.

The Cast

We put all together and created a sample application (1 posts table with 30 records) to do some benchmarks.

Benchmark configuration

We are not aiming to provide detailed or production figures, just a reference of the results obtained for your comparison. Results are generated from a development box, using PHP 7.1.12-3+ubuntu16.04.1+deb.sury.org+1 with xdebug enabled on ubuntu xenial, 8x Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz We baked the application using the latest CakePHP 3.5.10, and set application debug to false, and log output to syslog. As we are interested in boosting API response times the most, we tested the following scenarios
  • A) CakePHP json output, served from nginx+phpfpm
  • B) CakePHP + API Plugin Middleware integration json output, served from nginx+phpfpm
  • C) CakePHP + API Plugin Middleware integration json output, served from php-pm
Benchmark figures were obtained using ab -n 5000 -c 100 URL

Results

Scenario requests/second avg time
A) CakePHP json output, served from nginx+phpfpm 372.97 [#/sec] (mean) 268.120 [ms] (mean)
B) CakePHP + API Plugin Middleware integration json output, served from nginx+phpfpm 399.79 [#/sec] (mean) 250.133 [ms] (mean)
C) CakePHP + API Plugin Middleware integration json output, served from php-pm 911.95 [#/sec] (mean) 109.656 [ms] (mean)
  These results for a NOT OPTIMIZED CakePHP application are promising, and the improvement using PHP-PM is huge in this case. There are some important considerations though:
  • PHP-FPM is mature and stable, PHP-PM is still in early development, although there is a 1.0 version released already.
  • Processes need monitoring, specially regarding memory leaks, we would need to manage a restart policy and be able to hot-restart individual workers
  • System integration, init scripts are not provided, even if this is something easy to manage nowadays via systemd or monit, would be good to have for production
  • Application bootstrapping should not be affected by the request. If your application bootstrapping depends on the request params, or logged in user, you'll need to refactor your code
  • Session handling was not tested, issues are reported for PHP-PM for other frameworks. We were aiming to stateless API's so we don't know if this would be an issue for a regular application
Performance is always a concern for the API developer, applying proven paradigms like the event driven development (https://reactphp.org/) to your existing code would be the way to go and ensure backend frameworks like CakePHP will perform as required when dealing with the peaks we all love and hate.

Giving back to the community

This Plugin's development has been sponsored by the Cake Development Corporation. Contact us if you are interested in:      

Why an independent code review is important

Passbolt recently contacted us about doing a code review so we thought now would be a great time to share more about our code review process with you. While in-house and peer reviews are important to maximise code quality, it is still incredibly important to get an independent third party to review your code - that is where CakeDC can step in. Passbolt is free, open-source, self hosted password manager for teams which makes collaboration and sharing company account credentials within a team much easier. It's based on open security standards and uses OpenPGP to authenticate users and verify secrets server side. Passbolt consists of server side web app built in CakePHP providing web interface and API, and Chrome extension for client side. The overall aspects that are reviewed in our code review include a review of quality, implementation, security, performance, documentation and test coverage. When looking into quality, the team reviews aspects concerning the code following CakePHP conventions, coding standards and coding quality. Overall, passbolt’s code review revealed that CakePHP conventions and coding standards are largely followed, no concerns were detected. Implementation outlines key issues with framework use and approach. It includes reviewing the code for framework usage, separation of concerns as well as code reuse and modularity. Key recommendations are outlined at this point and guidance is given into how to solve any issues. For the Passbolt review, bigger or concerning issues were uncovered, but improvements were recommended and outlined within the closing documentation. The security portion of the code review deals with how secure the code is in terms of CakePHP usage. No security flaws were found in the passbolt code review. Our in depth code review focuses on performance, specifically investigating any bottlenecks in the code base and database as well as indexes optimization. For the full passbolt code review results, check out the Code review results. Passbolt has also posted about their review, check out their post here. If you or your company has a CakePHP application and you aren’t sure if its running at the optimum, then get in touch - Code reviews can offer insights and learning into how to improve your application.

We Bake with CakePHP