In this post I collect Roots Trellis local and remote provisioning, deployment and general setup errors, together with the solutions I came up with. The list keeps growing, so press Control+F or Command+F to search for your specific error, or use the search box.

Local Setup Errors

I have had quite a few errors setting up my local box and I will gather them here for future reference.

Composer not installed properly

Sometimes Vagrant provisioning hangs on the installation of dependencies, more specifically on Composer, which handles PHP dependency management. When I then try to load the local site after the Ansible playbook has stopped there, I get an NGINX 403 access denied error.

 TASK: [wordpress-install | Install Dependencies with Composer] **************** 
failed: [default] => (item={'key': 'domain.com', 'value': {'site_install': True, 'permalink_structure': '/%postname%/', 'admin_user': 'admin', 'local_path': '../site', 'cache': {'duration': '30s', 'enabled': False}, 'ssl': {'enabled': False}, 'multisite': {'enabled': False, 'subdomains': False}, 'site_title': 'Example Site', 'admin_password': 'admin', 'env': {'db_name': 'domain_dev', 'db_user': 'imagewize_dbuser', 'wp_env': 'development', 'db_password': 'domain_dbpassword', 'disable_wp_cron': True, 'wp_home': 'http://domain.dev', 'wp_siteurl': 'http://domain.dev/wp'}, 'site_hosts': ['domain.dev'], 'admin_email': 'admin@domain.dev'}}) => {"changed": true, "cmd": ["composer", "install"], "delta": "0:00:00.062962", "end": "2015-12-29 10:11:13.064551", "item": {"key": "imagewize.com", "value": {"admin_email": "admin@domain.dev", "admin_password": "admin", "admin_user": "admin", "cache": {"duration": "30s", "enabled": false}, "env": {"db_name": "domain_dev", "db_password": "domain_dbpassword", "db_user": "domain_dbuser", "disable_wp_cron": true, "wp_env": "development", "wp_home": "http://domain.dev", "wp_siteurl": "http://domain.dev/wp"}, "local_path": "../site", "multisite": {"enabled": false, "subdomains": false}, "permalink_structure": "/%postname%/", "site_hosts": ["domain.dev"], "site_install": true, "site_title": "Example Site", "ssl": {"enabled": false}}}, "rc": 1, "start": "2015-12-29 10:11:13.001589", "stdout_lines": [], "warnings": []}
stderr: You are running composer with xdebug enabled. This has a major impact on runtime performance. See https://getcomposer.org/xdebug
Composer could not find a composer.json file in /srv/www/domain.com/current
To initialize a project, please create a composer.json file as described in the https://getcomposer.org/ "Getting Started" section

The best solution I found is

vagrant destroy

followed by

vagrant up

Just running

vagrant provision

did not do the trick.
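The stderr output above says Composer found no composer.json in /srv/www/domain.com/current. Before destroying the box, it may be worth ruling out that the folder local_path points to is missing that file on the host. The sketch below assumes local_path is resolved relative to the Trellis directory; the helper name is made up for illustration.

```python
import os


def has_composer_json(trellis_dir, local_path="../site"):
    """Return True if composer.json exists in the folder local_path points to.

    Assumption: local_path is relative to the Trellis directory, as the
    '../site' value in the log output suggests.
    """
    site_dir = os.path.normpath(os.path.join(trellis_dir, local_path))
    return os.path.isfile(os.path.join(site_dir, "composer.json"))
```

Calling has_composer_json(os.getcwd()) from inside the Trellis directory returns False when the file is missing on the host. If it returns True, the file exists locally and the problem is more likely on the box itself.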

Package cannot be authenticated

This is a local setup error I often seem to get while setting up my Vagrant box when I am not using my VPN:

The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
apt-get install -y bindfs
Stdout from the command:
Reading package lists...
Reading state information...
The following NEW packages will be installed:
bindfs
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 25.1 kB of archives.
After this operation, 89.1 kB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
bindfs
Stderr from the command:
E: There are problems and -y was used without --force-yes

For some reason the package cannot be authenticated. I have no idea why, but it seems to happen to me out here in Bahrain. The solution is to switch to another network or a VPN and run

vagrant provision

to finish setting everything up locally.

Unexpected Exception: TaskInclude

Installation goes well, but just before all is done you get

ERROR! Unexpected Exception: 'TaskInclude' object has no attribute 'has_triggered'
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

At first I had no solution besides destroying the box and starting from scratch. Googling did not turn up much either: one loosely related thread, which did me no good. Apparently Ansible 2.1.0 has a regression, so version 2.0.2 needed to be installed; see the Roots Discourse thread on this. I could not do that using Homebrew, so I had to remove Ansible and then install it using Python’s pip. There I had trouble getting it to install at all due to OS X Mavericks and El Capitan issues, for which I eventually found a solution. I also needed to upgrade Python setuptools to avoid other errors such as:

ERROR! Unexpected Exception: (setuptools 1.1.6 (/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python), Requirement.parse('setuptools>=11.3'))

See the related Stack Overflow thread.
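A small pre-flight check could flag the affected Ansible version before a provisioning run. This is only a sketch: the function names are hypothetical, and the set of affected versions is an assumption based solely on the 2.1.0 regression mentioned above.

```python
import re

# Assumption: only 2.1.0 is treated as affected, per the regression noted above.
KNOWN_BAD = {(2, 1, 0)}


def parse_version(text):
    """Extract (major, minor, patch) from text such as 'ansible 2.1.0.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_affected(version_output):
    """Return True if the reported Ansible version is in KNOWN_BAD."""
    return parse_version(version_output) in KNOWN_BAD
```

Feed it the first line printed by ansible --version, for example is_affected("ansible 2.1.0.0").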

Dict Object has no Attribute

While setting up or provisioning the local server (the Vagrant box) I ran into this error as well:

TASK: [wordpress-setup | Create/assign database user to db and grant permissions] *** 
fatal: [default] => One or more undefined variables: 'dict object' has no attribute 'domain.com'

Provisioning Errors

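To fix the 'dict object' has no attribute error, check that vault.yml uses the correct site name, that is, the same key as the site in wordpress_sites.yml.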
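The comparison can be sketched as a set difference between the site keys of the two files. The example assumes both files have already been loaded into dictionaries (for instance with a YAML parser, after decrypting vault.yml if it is encrypted). The top-level layout and the sample data are assumptions for illustration.

```python
def missing_vault_sites(wordpress_sites, vault_sites):
    """Return site keys present in wordpress_sites but absent from the vault."""
    return sorted(set(wordpress_sites) - set(vault_sites))


# Hypothetical data mirroring the two files after loading them as dicts.
wordpress_sites = {"domain.com": {"site_hosts": ["domain.dev"]}}
vault_wordpress_sites = {"example.com": {"env": {"db_password": "secret"}}}

print(missing_vault_sites(wordpress_sites, vault_wordpress_sites))
# → ['domain.com']
```

Any name in the returned list is a site that Ansible will fail to look up in the vault.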

Unable to find Inventory File

ERROR: Unable to find an inventory file, specify one with -i ?

This error showed up when I wanted to provision a server that I had already provisioned for a different project and had forgotten about. It is better to start out with an empty instance. The command leading to this error is:

ansible-playbook server.yml -e env=staging

The solution was to wipe out Trellis, start from scratch and then provision again.

502 Bad Gateway

After provisioning a local vagrant box I ran into this error:

502 Bad Gateway
nginx

Suspending the Vagrant box and restarting it did not help. If you check the logs you may see something like:

*8 connect() to unix:/var/run/php-fpm-wordpress.sock failed (2: No such file or directory) while connecting to upstream, client: 192.168.50.1, server: domain.dev, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm-wordpress.sock:", host: "domain.dev", referrer: "http://domain.dev/"

This means there was an issue with FastCGI: NGINX could not find the PHP-FPM socket it was trying to connect to. The path to the socket should be

/var/run/php/php7.0-fpm.sock

So I opened

/etc/php/7.0/fpm/pool.d/wordpress.conf

and adjusted it. Then I restarted the server:

sudo service nginx restart

No joy. The socket still seemed to be loaded from another location, though I could not find that location in php.ini. So I destroyed the box:

vagrant destroy

and

 vagrant up

and re-created it. Then all was well again. If I bump into this issue again and manage to solve it manually, I will let you know.
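To see which socket NGINX is actually trying to reach, the path can be pulled out of the error log line and checked on disk. This is a minimal sketch with a hypothetical helper name. It only handles log lines shaped like the one quoted above.

```python
import os
import re


def upstream_socket(log_line):
    """Return the unix socket path from a 'connect() to unix:... failed' line, or None."""
    match = re.search(r"connect\(\) to unix:(\S+) failed", log_line)
    return match.group(1) if match else None


line = ("*8 connect() to unix:/var/run/php-fpm-wordpress.sock failed "
        "(2: No such file or directory) while connecting to upstream")
path = upstream_socket(line)
# Prints the socket path and whether it exists on the machine running this.
print(path, os.path.exists(path))
```

If the path it prints does not exist on the box, NGINX and the PHP-FPM pool configuration disagree about where the socket lives.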

sSMTP Installation Error

Another provisioning error I seem to get when I work with multiple domain names connected to one Elastic IP is a failure installing sSMTP:

 failed: [xx.xxx.xx.xx] => {"failed": true}
 stderr: hostname: Name or service not known
 dpkg: error processing package ssmtp (--configure):
 subprocess installed post-installation script returned error exit status 1
 Errors were encountered while processing:
 ssmtp
 E: Sub-process /usr/bin/dpkg returned an error code (1)
stdout: Reading package lists...
 Building dependency tree...
 Reading state information...
 The following NEW packages will be installed:
 ssmtp
 0 upgraded, 1 newly installed, 0 to remove and 4 not upgraded.
 Need to get 46.2 kB of archives.
 After this operation, 8192 B of additional disk space will be used.
 Get:1 http://us.archive.ubuntu.com/ubuntu/ trusty/universe ssmtp amd64 2.64-7 [46.2 kB]
 Preconfiguring packages ...
 Fetched 46.2 kB in 0s (366 kB/s)
 Selecting previously unselected package ssmtp.
 (Reading database ... 70521 files and directories currently installed.)
 Preparing to unpack .../ssmtp_2.64-7_amd64.deb ...
 Unpacking ssmtp (2.64-7) ...
 Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
 Setting up ssmtp (2.64-7) ...
msg: '/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'ssmtp'' failed: hostname: Name or service not known
 dpkg: error processing package ssmtp (--configure):
 subprocess installed post-installation script returned error exit status 1
 Errors were encountered while processing:
 ssmtp
 E: Sub-process /usr/bin/dpkg returned an error code (1)

I found an issue mentioning it on GitHub, https://github.com/roots/trellis/issues/148, which suggests checking /etc/hosts and /etc/hostname. My /etc/hostname contained imagewize (copied from the instance name given at DreamHost). I was still working on this issue and looking at adding an FQDN. The issue also suggests using a domain name, but I guess in my case the problem was using one domain too many. In the end I moved one staging site to another server.
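The "hostname: Name or service not known" line suggests the machine's own hostname does not resolve. One quick way to check is to see whether the name from /etc/hostname appears anywhere in /etc/hosts. The sketch below works on the file contents as text, and the helper name and sample data are made up for illustration.

```python
def names_in_hosts(hosts_text):
    """Collect every hostname or alias listed in /etc/hosts-style text."""
    names = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if line:
            names.update(line.split()[1:])  # the first field is the IP address
    return names


# Hypothetical /etc/hosts contents in which the hostname entry is commented out.
sample = "127.0.0.1 localhost\n# 127.0.1.1 imagewize\n"
print("imagewize" in names_in_hosts(sample))
# → False
```

On the server, pass in the real contents of /etc/hosts and test for the name stored in /etc/hostname.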

 

Publickey – Permission denied

Another issue I sometimes run into involves the SSH public key:

GATHERING FACTS ***************************************************************
fatal: [xx.xxx.xx.xx] => SSH Error: Permission denied (publickey).
while connecting to xx.xxx.xx:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.

Often this means one of these issues:

Deployment Errors

Another error I got while trying to deploy looked really odd at first, as it mentioned paths to Ansible files I could not locate right away, along with output that pointed to SSH key issues in the end:

./deploy.sh staging domain.com
PLAY [Deploy WP site] *********************************************************
GATHERING FACTS ***************************************************************
ok: [xx.xxx.xx.xx]

TASK: [deploy | Initialize] ***************************************************
failed: [xx.xxx.xx.xx] => {"failed": true, "parsed": false}
Traceback (most recent call last):
File "/home/web/.ansible/tmp/ansible-tmp-1451380440.18-256782094434614/deploy_helper", line 2026, in
main()
File "/home/web/.ansible/tmp/ansible-tmp-1451380440.18-256782094434614/deploy_helper", line 382, in main
changes += deploy_helper.create_path(facts['project_path'])
File "/home/web/.ansible/tmp/ansible-tmp-1451380440.18-256782094434614/deploy_helper", line 276, in create_path
os.makedirs(path)
File "/usr/lib/python2.7/os.py", line 150, in makedirs
makedirs(head, mode)
File "/usr/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/srv/www'
OpenSSH_6.9p1, LibreSSL 2.1.8
debug1: Reading configuration data /Users/jasper/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug1: auto-mux: Trying existing master
debug1: mux_client_request_session: master session id: 2
Shared connection to xx.xxx.xx.xx closed.

The error was related to the earlier provisioning issue. A clean slate, with a clean remote server and a new Trellis installation, was all I could come up with to solve it.

NGINX Error 404

I also bumped into a:

404 Not Found
 nginx

error. This was simply because I had provisioned but not yet deployed. Once that was done all was well.

Wrong Subtree

TASK: [deploy | Fail if project_subtree_path is set incorrectly] ************** 
failed: [xxx.xxx.xx.xx] => {"failed": true}
msg: subtree is set to 'site' but that path does not exist in the repo. Edit `subtree_path` for 'domain.com' in `wordpress_sites.yml`.

Make sure you have added the correct repo to wordpress_sites.yml and that the subtree is set correctly. The last time this happened I had switched from GitHub to Bitbucket (free private repos!) but had not adjusted the repository.
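The subtree half of that check can be approximated locally by comparing the configured subtree against the top-level directories of the repository. The sketch assumes you pass in the text printed by git ls-tree -d --name-only HEAD, run in a clone of the repository the deploy actually uses; the helper name is hypothetical.

```python
def subtree_exists(ls_tree_output, subtree_path):
    """Return True if subtree_path is one of the listed top-level directories.

    ls_tree_output is the text printed by `git ls-tree -d --name-only HEAD`,
    one directory name per line.
    """
    return subtree_path.strip("/") in ls_tree_output.splitlines()


# Hypothetical listing for a repo that holds both 'site' and 'trellis' folders.
print(subtree_exists("site\ntrellis\n", "site"))
# → True
```

This only covers a single-level subtree such as site. It also says nothing about whether the repository URL itself is right, which was my actual mistake.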

Repo Access Error

A few times I suddenly got an error that the remote repository could no longer be accessed, even though I could still push to and pull from the GitHub repository. This made it impossible to deploy. See the errors below:

TASK [deploy : Clone project files] ********************************************
System info:
 Ansible 2.1.1.0; Darwin
 Trellis at "Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1"
---------------------------------------------------
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

fatal: [domain.nl]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
...ignoring

TASK [deploy : Failed connection to remote repo] *******************************
System info:
 Ansible 2.1.1.0; Darwin
 Trellis at "Fix #639 - WP 4.6 compatibility: update WP-CLI to 0.24.1"
---------------------------------------------------
Git repo git@github.com:jasperf/domain.nl.git cannot be accessed. Please
verify the repository exists and you have SSH forwarding set up correctly.
More info:
> https://roots.io/trellis/docs/deploys/#ssh-keys
> https://roots.io/trellis/docs/ssh-keys/#cloning-remote-repo-using-ssh-
agent-forwarding

fatal: [domain.nl]: FAILED! => {"changed": false, "failed": true}

NO MORE HOSTS LEFT *************************************************************
 [WARNING]: Could not create retry file 'deploy.retry'. [Errno 2] No such
file or directory: ''

In the end the issue was probably related to my upgrade to macOS Sierra and some system or keychain changes. Earlier I had to enter my passphrase, and somehow the key also had to be added to the keychain again. This was a simple:

ssh-add -K
Identity added: /Users/jasper/.ssh/id_rsa (/Users/jasper/.ssh/id_rsa)

After that I could access everything again and deploy the latest changes to the site.
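Before deploying, it can help to confirm that the SSH agent actually holds a key, since agent forwarding has nothing to forward otherwise. The sketch below interprets the exit status of ssh-add -l (0 when keys are listed, 1 when the agent has none, 2 when no agent can be contacted). The function names and messages are made up for illustration.

```python
import subprocess

# Exit statuses of `ssh-add -l` mapped to a short explanation.
MESSAGES = {
    0: "agent has at least one key loaded",
    1: "agent is running but holds no keys",
    2: "could not contact an ssh-agent",
}


def describe_agent(returncode):
    """Translate an `ssh-add -l` exit status into a readable message."""
    return MESSAGES.get(returncode, "unexpected exit status %d" % returncode)


def agent_status():
    """Run `ssh-add -l` quietly and describe the result."""
    code = subprocess.call(
        ["ssh-add", "-l"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return describe_agent(code)
```

If agent_status() reports that the agent holds no keys, adding the key again as shown above is the likely fix.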

7 Responses

  1. I stumbled across your site because I also encountered the //unix:/var/run/php-fpm-wordpress.sock error.

    I figured it out the hard way. I didn’t have the ability to blow away a client’s DigitalOcean droplet, so I had to troubleshoot manually. I’m coming from Apache and starting to learn NGINX.

    The trick is that you have to restart the PHP-FPM service, not NGINX, any time you modify PHP-FPM config files:

    service php7.0-fpm restart

    1. Just had

      php7.1-fpm.service is not active, cannot reload.
      fatal: [site.com]: FAILED! => {"changed": true, "cmd": "sudo service php7.1-fpm reload", "delta": "0:00:00.046387", "end": "2017-04-21 05:56:11.802013", "failed": true, "rc": 1, "start": "2017-04-21 05:56:11.755626", "stderr": "php7.1-fpm.service is not active, cannot reload.", "stdout": "", "stdout_lines": [], "warnings": []}

      doing a production server deployment. When I did a restart as root

      service php7.1-fpm restart
      Job for php7.1-fpm.service failed because the control process exited with error code. See "systemctl status php7.1-fpm.service" and "journalctl -xe" for details.

      Then I decided to do a reboot and afterwards I was able to deploy again.

  2. Thanks so much for covering that Ansible version, ‘TaskInclude’ problem. It’s so difficult to find a decisive answer in the usual places sometimes.

    J

  3. Thank you so much for this useful post. However, in the last error you mention – “Repo Access Error”, I have done everything you said but it’s not working. Would you kindly take a look at my issue right now, which is on roots discourse:

    https://discourse.roots.io/t/deploy-hanging-indefinitely-at-copy-project-files-step/9208

    I really don’t know what to do next to make this work, it would be greatly appreciated if you could spend a couple of minutes to help me. Thank you and have a good day.

    1. It seems you already got some great suggestions on the Roots Discourse forum. Try what was suggested there:

      “Good news. Trellis gives you a means to make the git.conceptual.site host known ahead of time, avoiding all the trouble above. You could add git.conceptual.site to group_vars/all/known_hosts.yml. This command may help you find the key to add:
      ssh-keyscan git.conceptual.site”
