Tag Archives: tutorial

owncloud vs pydio – more diy cloud storage

Last week I wrote a how-to on using Pydio as a front-end to a MooseFS distributed data storage cluster.

The big complaint I had while writing that was that I wanted to use ownCloud, but it doesn’t Just Work™ on CentOS 6*.

After finishing the tutorial, I decided to do some more digging – because ownCloud looks cool. And because it bugged me that it didn’t work on CentOS 6.

What I found is that ownCloud 8 doesn’t work on CentOS 6 (at least not easily).

The simple install guide and process is really about version 8, and the last version that can be speedy-installed on CentOS 6 is 7. And as everyone knows, major version releases often make major changes in how they work. This appears to be very much the case with ownCloud going from 7 to 8.

In fact, the two pages needed for installing ownCloud are so easy to follow, I see no reason to copy them here. It’s literally three shell commands followed by a web wizard. It’s almost too easy.

You need to have MySQL/MariaDB installed and ready to accept connections (or use SQLite) – make a database, user, and give the user perms on the db. And you need Apache installed and running (along with PHP – but yum will manage that for you).
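A minimal sketch of the database prep, assuming MySQL 5.x or MariaDB 5.5 (where `GRANT ... IDENTIFIED BY` still creates the user). The helper name `owncloud_db_sql`, the database name, the user name, and the password are all placeholders of mine, not anything ownCloud mandates.

```shell
#!/bin/sh
# Print SQL that creates a database plus a local user with full rights on it.
# Usage: owncloud_db_sql DBNAME USER PASSWORD
owncloud_db_sql() {
    db=$1 user=$2 pass=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
GRANT ALL PRIVILEGES ON \`$db\`.* TO '$user'@'localhost' IDENTIFIED BY '$pass';
FLUSH PRIVILEGES;
EOF
}

# Example (names and password are hypothetical):
#   owncloud_db_sql owncloud ocuser 'change-me' | mysql -u root -p
```

Printing the SQL instead of running it directly makes it easy to review before piping it into `mysql`.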

If you’re going to use MooseFS (or any other similar tool) for your storage backend to ownCloud, be sure, too, to bind mount your MFS mount point back to the ownCloud data directory (by default it’s /var/www/html/owncloud/data). Note: you could start by using local storage for ownCloud, and only migrate to a distributed setup later.
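Here is a rough sketch of that bind mount, assuming the MooseFS client is mounted at /mnt/mfs (a path I made up for illustration) and ownCloud uses its default data directory. The helper `fstab_bind_line` is hypothetical; it only formats the /etc/fstab entry.

```shell
#!/bin/sh
# Print an /etc/fstab line that bind-mounts SRC onto DST.
# Usage: fstab_bind_line SRC DST
fstab_bind_line() {
    printf '%s %s none bind 0 0\n' "$1" "$2"
}

# Example, run as root (/mnt/mfs is an assumed MooseFS mount point):
#   mount --bind /mnt/mfs /var/www/html/owncloud/data
#   fstab_bind_line /mnt/mfs /var/www/html/owncloud/data >> /etc/fstab
```

The fstab entry only helps if the MooseFS mount itself is up before the bind mount is attempted at boot, so test a reboot before trusting it.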

Pros of Pydio

  • very little futzing needed to make it work with CentOS 6
  • very clean user management
  • very clean webui
  • light system requirements (doesn’t even require a database)

Pros of ownCloud

  • apps available for major mobile platforms (iOS, Android) and desktop
  • no futzing needed to work with CentOS 7
  • very clean user management
  • clean webui

Cons of Pydio

  • no interface except the webui

Cons of ownCloud

  • needs a database
  • heavier system requirements
  • doesn’t like CentOS 6

What about other cloud environments like Seafile? I like Seafile, too. Have it running, in fact. Would recommend it – though I think there are better options available now (including ownCloud & Pydio).

*Why do I keep harping on the CentOS 6 vs 7 support / ease-of-use? Because CentOS / RHEL 7 is different from previous releases. I covered that it was different for the Blue Grass Linux User Group a few months ago. Yeah, I know I should be embracing the New Way™ of doing things – but like most people, I can be a technical curmudgeon (especially humorous when you consider I work in a field that is about not being curmudgeonly).

Guess this means I really need to dive into the new means of doing things (mostly the differences in how services are managed) – fortunately, the Fedora Project put together this handy cheatsheet. And Digital Ocean has a slew of tutorials on basic sysadmin things – one I used for this comparison was here.

create your own clustered cloud storage system with moosefs and pydio

This started-off as a how-to on installing ownCloud. But their own installation procedures don’t work for the 8.0x release and CentOS 6.

Most of you know I’ve been interested in distributed / cloud storage for quite some time.

And that I find MooseFS to be fascinating. As of 2.0, MooseFS comes in two flavors – the Community Edition, and the Professional Edition. This how-to uses the CE flavor, but it’d work with the Pro version, too.

I started with the MooseFS install guide (pdf) and the Pydio quick start steps. And, as usual, I used Digital Ocean to host the cluster while I built it out. Of course, this will work with any hosting provider (even internal to your data center using something like Backblaze storage pods – I chose Digital Ocean because they have hourly pricing; Chunk Host is a “better” deal if you don’t care about hourly pricing). In many ways, this how-to is in response to my rather hackish (though quite functional) need to offer file storage in an otherwise-overloaded lab several years back. Make sure you have “private networking” (or equivalent) enabled for your VMs – don’t want to be sharing-out your MooseFS storage to just anyone 🙂

Also, as I’ve done in other how-tos on this blog, I’m using CentOS Linux for my distro of choice (because I’m an RHEL guy, and it shortens my learning curve).

With the introduction out of the way, here’s what I did – and what you can do, too:


  • spin-up at least 3 (4 would be better) systems (for purposes of the how-to, low-resource (512M RAM, 20G storage) machines were used; use the biggest [storage] machines you can for Chunk Servers, and the biggest [RAM] machine(s) you can for the Master(s))
    • 1 for the MooseFS Master Server (if using Pro, you want at least 2)
    • (1 or more for metaloggers – only for the Community edition, and not required)
    • 2+ for MooseFS Chunk Servers (minimum required to ensure data is available in the event of a Chunk failure)
    • 1 for Pydio (while this might be able to co-reside with the MooseFS Master – this tutorial uses a fully-separate / tiered approach)
  • make sure the servers are either all in the same data center, or that you’re not paying for inter-DC traffic
  • make sure you have “private networking” (or equivalent) enabled so you do not share your MooseFS mounts to the world
  • make sure you have some swap space on every server (may not matter, but I prefer “safe” to “sorry”) – I covered how to do this in the etherpad tutorial

MooseFS Master

  • install MooseFS master
    • curl "http://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS && curl "http://ppa.moosefs.com/MooseFS-stable-rhsysv.repo" > /etc/yum.repos.d/MooseFS.repo && yum -y install moosefs-master moosefs-cli
  • make changes to /etc/mfs/mfsexports.cfg
    • # Allow everything but “meta”.
    • #* / rw,alldirs,maproot=0
    • / rw,alldirs,maproot=0
  • add hostname entry to /etc/hosts
    • <private IP of the master> mfsmaster
  • start master
    • service moosefs-master start
  • see how much space is available to you (none to start)
    • mfscli -SIN
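The /etc/hosts step is easy to get wrong by appending the same line twice, so here is a small sketch that only adds the entry when the name is missing. The function name `add_host_entry` and the example address 10.0.0.5 are placeholders of mine; substitute the private IP of your own master.

```shell
#!/bin/sh
# Append "IP NAME" to a hosts file unless NAME is already listed there.
# Usage: add_host_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
add_host_entry() {
    ip=$1 name=$2 file=${3:-/etc/hosts}
    if grep -qw -- "$name" "$file" 2>/dev/null; then
        return 0    # already present; leave the file untouched
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example, run as root (10.0.0.5 is a made-up private address):
#   add_host_entry 10.0.0.5 mfsmaster
```

The same helper works unchanged on the chunk servers and on the client, since they all need the same mfsmaster line.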

MooseFS Chunk(s)

  • install MooseFS chunk
    • curl "http://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS && curl "http://ppa.moosefs.com/MooseFS-stable-rhsysv.repo" > /etc/yum.repos.d/MooseFS.repo && yum -y install moosefs-chunkserver
  • add the mfsmaster line from previous steps to /etc/hosts
    • cat >> /etc/hosts
    • <private IP of the master> mfsmaster
    • <ctrl>-d
  • make your share directory
    • mkdir /mnt/mfschunks
  • add your freshly-made directory to the end of /etc/mfs/mfshdd.cfg, with a size you want to share
    • /mnt/mfschunks 15GiB
  • start the chunk
    • service moosefs-chunkserver start
  • on the MooseFS master, make sure your new space has become available
    • mfscli -SIN
  • repeat for as many chunks as you want to have
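Since the chunk steps get repeated on every chunk server, a sketch like the following may save some typing. It assumes the config lives at /etc/mfs/mfshdd.cfg and that the packaged chunkserver runs as a user named mfs (both assumptions about the RPM install, so check your system); the function name `prepare_chunk_dir` is mine.

```shell
#!/bin/sh
# Create a chunk directory and register it in mfshdd.cfg (once).
# Usage: prepare_chunk_dir DIR SIZE [CFG]   (CFG defaults to /etc/mfs/mfshdd.cfg)
prepare_chunk_dir() {
    dir=$1 size=$2 cfg=${3:-/etc/mfs/mfshdd.cfg}
    mkdir -p "$dir" || return 1
    # Assumption: the chunkserver runs as user "mfs"; skip chown if no such user.
    if id mfs >/dev/null 2>&1; then
        chown mfs:mfs "$dir"
    fi
    # Only append the entry when the directory is not already listed.
    grep -qF -- "$dir" "$cfg" 2>/dev/null || printf '%s %s\n' "$dir" "$size" >> "$cfg"
}

# Example, run as root on each chunk server:
#   prepare_chunk_dir /mnt/mfschunks 15GiB && service moosefs-chunkserver start
```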

Pydio / MooseFS Client

  • install MooseFS client
    • curl "http://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS && curl "http://ppa.moosefs.com/MooseFS-stable-rhsysv.repo" > /etc/yum.repos.d/MooseFS.repo && yum -y install moosefs-client
  • add the mfsmaster line from previous steps to /etc/hosts
    • cat >> /etc/hosts
    • <private IP of the master> mfsmaster
    • <ctrl>-d
  • mount MooseFS share somewhere where Pydio will be able to get to it later (we’ll use a bind mount for that in a while)
    • mfsmount /mnt/mfs -H mfsmaster
  • install Apache and PHP
    • yum -y install httpd
    • yum -y install php-common
      • you need more than this, and hopefully Apache grabs it for you – I installed Nginx then uninstalled it, which brought-in all the PHP stuff I needed (and probably stuff I didn’t)
  • modify php.ini to support large files (Pydio is exclusively a webapp for now)
    • memory_limit = 384M
    • post_max_size = 256M
    • upload_max_filesize = 200M
  • grab Pydio
    • you can use either the yum method, or the manual – I picked manual
    • curl -LO http://hivelocity.dl.sourceforge.net/project/ajaxplorer/pydio/stable-channel/6.0.6/pydio-core-6.0.6.tar.gz
      • URL correct as of publish date of this blog post
  • extract Pydio tgz to /var/www/html
  • move everything in /var/www/html/data to /mnt/mfs
  • bind mount /mnt/mfs to /var/www/html/data
    • mount --bind /mnt/mfs /var/www/html/data
  • set ownership of all Pydio files to apache:apache
    • cd /var/www/html && chown -R apache:apache *
    • note – this will give an error such as the one in this screenshot:
    • [screenshot: Screen Shot 2015-04-20 at 16.32.48]
    • this is “ok” – but don’t leave it like this (good enough for a how-to, not production)
  • start Pydio wizard
  • fill-in forms as they say they should be (admin, etc)
    • I picked “No DB” for this tutorial – you should use a database if you want to roll this out “for real”
  • log in and start using it
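Editing php.ini by hand works fine, but if you rebuild the Pydio box a few times a scripted version is handy. This is a sketch only: `set_php_ini` is a helper name I invented, and it assumes each setting appears at most once as an uncommented `key = value` line (which is how the stock CentOS /etc/php.ini ships these three).

```shell
#!/bin/sh
# Set KEY = VALUE in an ini file: rewrite the existing uncommented line,
# or append a new one if the key is not set yet. Keeps a .bak copy on rewrite.
# Usage: set_php_ini FILE KEY VALUE
set_php_ini() {
    file=$1 key=$2 value=$3
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file"; then
        sed -i.bak "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example, run as root, using the values from the list above:
#   set_php_ini /etc/php.ini memory_limit 384M
#   set_php_ini /etc/php.ini post_max_size 256M
#   set_php_ini /etc/php.ini upload_max_filesize 200M
#   service httpd restart
```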


Now what?

Why would you want to do this? Maybe you need an in-house shared/shareable storage environment for your company / organization / school / etc. Maybe you’re just a geek who likes to play with new things. Or maybe you want to get into the reselling business, and being able to offer a redundant, clustered, cloud, on-demand type storage service is something you, or your customers, would find profitable.

Caveats of the above how-to:

  • nothing about this example is “production-level” in any manner (I used Digital Ocean droplets at the very small end of the spectrum (512M memory, 20G storage, 1 CPU))
    • there is a [somewhat outdated] sizing guide for ownCloud (pdf) that shows just how much it wants for resources in anything other than a toy deployment
    • Pydio is pretty light on its basic requirements – which also helped this how-to out
    • while MooseFS is leaner when it comes to system requirements, it still shouldn’t be nerfed by being stuck on small machines
  • you shouldn’t be managing hostnames via /etc/hosts – you should be using DNS
    • DNS settings are far more than I wanted to deal with in this tutorial
  • security has, intentionally, been ignored in this how-to
    • just like verifying your inputs is ignored in the vast majority of programming classes, I ignored security considerations (other than putting the MooseFS servers on non-public-facing IPs)
    • don’t be dumb about security – it’s a real issue, and one you need to plan-in from the very start
      • DO encrypt your file systems
      • DO ensure your passwords are complex (and used rarely)
      • DO use key-based authentication wherever possible
      • DON’T be naive
  • you should be on the MooseFS mailing list and the Pydio forum
    • the communities are excellent, and have been extremely helpful to me, even as a lurker
  • I cannot answer more than basic questions about any of the tools used herein
  • why I picked what I picked and did it the way I did
    • I picked MooseFS because it seems the easiest to run
    • I picked Pydio because the ownCloud docs were borked for the 8.0x release on CentOS 6 – and it seems better than alternatives I could find (Seafile, etc) for this tutorial
    • I wanted to use ownCloud because it has clients for everywhere (iOS, Android, web, etc)
    • I have no affiliation with either MooseFS or Pydio beyond thinking they’re cool
    • I like learning new things and showing them off to others

Final thoughts

Please go make this better and show-off what you did that was smarter, more efficient, cheaper, faster, etc. Turn it into something you could deploy as an AMI on AWS. Or Docker containers. Or something I couldn’t imagine. Everything on this site is licensed under the CC BY 3.0 – have fun with what you find, make it awesomer, and then tell everyone else about it.

I think I’ll give LizardFS a try next time – their architecture is, diagrammatically, identical to the “pro” edition of MooseFS. And it’d be fun to have experience with more than one solution.

automatically extract email attachments with common linux tools

I had need to automatically process emails to a specific address to pull attachments out, and this is how I did it:

$ yum install mpack

$ cat extract-attach.sh 
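#!/bin/sh
# Move newly delivered mail into a scratch directory, unpack any
# attachments into the home directory, then discard the raw messages.
rm -rf ~/attachtmp
mkdir ~/attachtmp
mv ~/Maildir/new/* ~/attachtmp
# munpack writes extracted files into the current directory
cd ~ || exit 1
munpack ~/attachtmp/*
rm -rf ~/attachtmp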

$ crontab -l
*/5 * * * *	~/extract-attach.sh

Why, you may ask? Because I get a report a few times per day to the email address in question.

Note – this runs in my crontab every 5 minutes on a CentOS 6 x64 server; I’m sure the process is similar/identical on other distros, but I haven’t personally tried.
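If you want something a little more defensive than my script, here is a sketch that stays quiet when no mail is waiting and lets you pick the output directory. The function name `extract_attachments` and the `MUNPACK` override variable are my own inventions; `-q` (quiet) and `-C` (output directory) are munpack options.

```shell
#!/bin/sh
# Unpack attachments from every message in MAILDIR_NEW into OUTDIR,
# then delete the processed messages. Does nothing when no mail is waiting.
# MUNPACK may be set to substitute another command (handy for dry runs).
# Usage: extract_attachments MAILDIR_NEW OUTDIR
extract_attachments() {
    maildir_new=$1 outdir=$2
    tmp=$(mktemp -d) || return 1
    for msg in "$maildir_new"/*; do
        [ -f "$msg" ] || continue      # unmatched glob or non-file: skip
        mv "$msg" "$tmp"/
    done
    if [ -n "$(ls -A "$tmp")" ]; then
        "${MUNPACK:-munpack}" -q -C "$outdir" "$tmp"/*
    fi
    rm -rf "$tmp"
}

# Example:
#   extract_attachments "$HOME/Maildir/new" "$HOME"
```

As in the original script, the raw messages are deleted after processing, so only use this on a mailbox you do not need to keep.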

setting up an unreal irc server on centos 6

Ever want to run an IRC server? I recently set one up at irc.datente.com using a Digital Ocean VM running CentOS 6.5 x64.

Here’s what I did, if you want to replicate the process for yourself (full documentation available from Unreal’s website):

  • acquire CentOS 6.5 x64 server (as I mentioned, I used Digital Ocean)
  • `yum -y install screen wget gcc`
  • `yum -y upgrade`
  • `adduser unreal`
  • `su - unreal`
  • download Unreal to your server (http://www.unrealircd.com/downloads/unreal/source – `wget http://www.unrealircd.com/downloads/Unreal3.2.10.2.tar.gz`)
  • `tar zxf Unreal*.gz`
  • `cd Unreal*`
  • `make clean`
  • `./Config`
    • answer prompts – most can be left default
  • `make`
  • `cp doc/example.conf unrealircd.conf`
  • edit unrealircd.conf (use your editor of choice)
    • see sample config file below for what I did (minus passwords / emails)
  • if all has gone well, start Unreal
    • `screen ./unreal start`
  • create a startup script to ensure Unreal launches on reboot as user `unreal`
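One low-effort way to handle the last step is an `@reboot` entry in the unreal user's crontab rather than a full init script. This is a sketch under assumptions: the helper name `unreal_reboot_line` is mine, and the install directory (I am guessing the tarball unpacks to ~/Unreal3.2.10.2) needs to match wherever you actually built Unreal.

```shell
#!/bin/sh
# Print a crontab line that starts Unreal at boot from INSTALL_DIR.
# Usage: unreal_reboot_line INSTALL_DIR
unreal_reboot_line() {
    printf '@reboot cd %s && ./unreal start\n' "$1"
}

# Example, run as the unreal user (directory name is an assumption):
#   ( crontab -l 2>/dev/null; unreal_reboot_line "$HOME/Unreal3.2.10.2" ) | crontab -
```

Because the entry lives in the unreal user's own crontab, the server comes back up under that account without any root-owned script.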

That’s it. Thankfully, while the config file isn’t pleasant to play with, it’s a lot better than it used to be.

loadmodule "src/modules/commands.so";
loadmodule "src/modules/cloak.so";

include "help.conf";
include "badwords.channel.conf";
include "badwords.message.conf";
include "badwords.quit.conf";
include "spamfilter.conf";

me
{
        name "your.irc.server.tld";
        info "Your IRC Server";
        numeric 1;
};

admin {
        "Your Name";
};

class           clients
{
        pingfreq 90;
        maxclients 500;
        sendq 100000;
        recvq 8000;
};

class           servers
{
        pingfreq 90;
        maxclients 10;          /* Max servers we can have linked at a time */
        sendq 1000000;
        connfreq 100; /* How many seconds between each connection attempt */
};

allow {
        ip             *@*;
        hostname       *@*;
        class           clients;
        maxperip 25;
};

/* Passworded allow line */
allow {
        ip             *@;
        hostname       *@*.passworded.ugly.people;
        class           clients;
        password "f00Ness";
        maxperip 1;
};

allow channel {
        channel "#WarezSucks";
        class "clients";
};

oper youroperatornick {
        class           clients;
        from {
                userhost bob@smithco.com;
        };
        password "yourpassword";
};

listen         *:6697
{
        options
        {
// uncomment this line if you chose to compile Unreal with SSL support
//              ssl;
        };
};

listen         *:8067;
listen         *:6667;

/* not linking to any other servers right now
link            hub.mynet.com
{
        username        *;
        bind-ip         *;
        port            7029;
        hub             *;
        password-connect "LiNk";
        password-receive "LiNk";
        class           servers;
                options {
                        Note: You should not use autoconnect when linking services
                };
};
*/

ulines {
};

drpass {
        restart "I-love-to-restart";
        die "die-you-stupid";
};

log "ircd.log" {
        /* Delete the log file and start a new one when it reaches 20MB, leave this out to always use the
           same log */
        maxsize 20971520;
        flags {
        };
};

alias NickServ { type services; };
alias ChanServ { type services; };
alias OperServ { type services; };
alias HelpServ { type services; };
alias StatServ { type stats; };

alias "identify" {
        format "^#" {
                target "chanserv";
                type services;
                parameters "IDENTIFY %1-";
        };
        format "^[^#]" {
                target "nickserv";
                type services;
                parameters "IDENTIFY %1-";
        };
        type command;
};

alias "services" {
        format "^#" {
                target "chanserv";
                type services;
                parameters "%1-";
        };
        format "^[^#]" {
                target "nickserv";
                type services;
                parameters "%1-";
        };
        type command;
};

alias "identify" {
        format "^#" {
                target "chanserv";
                type services;
                parameters "IDENTIFY %1-";
        };
        format "^[^#]" {
                target "nickserv";
                type services;
                parameters "IDENTIFY %1-";
        };
        type command;
};

alias "glinebot" {
        format ".+" {
                command "gline";
                type real;
                parameters "%1 2d Bots are not allowed on this server, please read the faq at http://www.example.com/faq/123";
        };
        type command;
};

files
{
        /* The Message Of The Day shown to users who log in: */
        /* motd ircd.motd; */

        /*
         * A short MOTD. If this file exists, it will be displayed to
         * the user in place of the MOTD. Users can still view the
         * full MOTD by using the /MOTD command.
         */
        /* shortmotd ircd.smotd; */

        /* Shown when an operator /OPERs up */
        /* opermotd oper.motd; */

        /* Services MOTD append. */
        /* svsmotd ircd.svsmotd; */

        /* Bot MOTD */
        /* botmotd bot.motd; */

        /* Shown upon /RULES */
        /* rules ircd.rules; */

        /*
         * Where the IRCd stores and loads a few values which should
         * be persistent across server restarts. Must point to an
         * existing file which the IRCd has permission to alter or to
         * a file in a folder within which the IRCd may create files.
         */
        /* tunefile ircd.tune; */

        /* Where to save the IRCd's pid. Should be writable by the IRCd. */
        /* pidfile ircd.pid; */
};

tld {
        mask *@*.fr;
        motd "ircd.motd.fr";
        rules "ircd.rules.fr";
};

/* note: you can just delete the example block above,
 * in which case the defaults motd/rules files (ircd.motd, ircd.rules)
 * will be used for everyone.
 */

ban nick {
        mask "*C*h*a*n*S*e*r*v*";
        reason "Reserved for Services";
};

ban ip {
        reason "Delinked server";
};

ban server {
        mask eris.berkeley.edu;
        reason "Get out of here.";
};

ban user {
        mask *tirc@*.saturn.bbn.com;
        reason "Idiot";
};

ban realname {
        mask "sub7server";
        reason "sub7";
};

except ban {
        /* don't ban stskeeps */
        mask           *stskeeps@212.*;
};

deny dcc {
        filename "*sub7*";
        reason "Possible Sub7 Virus";
};

deny channel {
        channel "*warez*";
        reason "Warez is illegal";
        class "clients";
};

vhost {
        vhost           i.hate.microsefrs.com;
        from {
                userhost       *@*.image.dk;
        };
        login           stskeeps;
        password        moocowsrulemyworld;
};

set {
        network-name            "Datente";
        default-server          "irc.datente.com";
        services-server         "irc.datente.com";
        stats-server            "irc.datente.com";
        help-channel            "#datente";
        hiddenhost-prefix       "rox";
        /* prefix-quit          "no"; */
        /* Cloak keys should be the same at all servers on the network.
         * They are used for generating masked hosts and should be kept secret.
         * The keys should be 3 random strings of 5-100 characters
         * (10-20 chars is just fine) and must consist of lowcase (a-z),
         * upcase (A-Z) and digits (0-9) [see first key example].
         * HINT: On *NIX, you can run './unreal gencloak' in your shell to let
         *       Unreal generate 3 random strings for you.
         */
        cloak-keys {
        };
        /* on-oper host */
        hosts {
                local           "locop.roxnet.org";
                global          "ircop.roxnet.org";
                coadmin         "coadmin.roxnet.org";
                admin           "admin.roxnet.org";
                servicesadmin   "csops.roxnet.org";
                netadmin        "netadmin.roxnet.org";
                host-on-oper-up "no";
        };
};

set {
        kline-address "your@email.tld";
        modes-on-connect "+ixw";
        modes-on-oper    "+xwgs";
        oper-auto-join "#opers";
        options {
                /* You can enable ident checking here if you want */
                /* identd-check; */
        };

        maxchannelsperuser 10;
        /* The minimum time a user must be connected before being allowed to use a QUIT message,
         * This will hopefully help stop spam */
        anti-spam-quit-message-time 10s;
        /* Make the message in static-quit show in all quits - meaning no
           custom quits are allowed on local server */
        /* static-quit "Client quit";   */

        /* You can also block all part reasons by uncommenting this and say 'yes',
         * or specify some other text (eg: "Bye bye!") to always use as a comment.. */
        /* static-part yes; */

        /* This allows you to make certain stats oper only, use * for all stats,
         * leave it out to allow users to see all stats. Type '/stats' for a full list.
         * Some admins might want to remove the 'kGs' to allow normal users to list
         * klines, glines and shuns.
         */
        oper-only-stats "okfGsMRUEelLCXzdD";

        /* Throttling: this example sets a limit of 3 connection attempts per 60s (per host). */
        throttle {
                connections 3;
                period 60s;
        };

        /* Anti flood protection */
        anti-flood {
                nick-flood 3:60;        /* 3 nickchanges per 60 seconds (the default) */
        };

        /* Spam filter */
        spamfilter {
                ban-time 1d; /* default duration of a *line ban set by spamfilter */
                ban-reason "Spam/Advertising"; /* default reason */
                virus-help-channel "#help"; /* channel to use for 'viruschan' action */
                /* except "#help"; channel to exempt from filtering */
        };
};
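A config this long is easy to break with a missing brace, so before starting the server I find a crude balance check worthwhile. This sketch just counts `{` and `}` characters; the function name `check_braces` is hypothetical, and it will be fooled by braces inside quoted strings or comments.

```shell
#!/bin/sh
# Succeed when FILE contains as many "{" as "}" characters, fail otherwise.
# Crude: does not understand quoting or comments.
# Usage: check_braces FILE
check_braces() {
    open=$(($(tr -cd '{' < "$1" | wc -c)))
    close=$(($(tr -cd '}' < "$1" | wc -c)))
    [ "$open" -eq "$close" ]
}

# Example:
#   check_braces unrealircd.conf || echo "unbalanced braces in unrealircd.conf"
```

It is no substitute for the error messages Unreal prints at startup, but it catches the most common copy-and-paste slip quickly.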

posting from google+ to other services with ifttt

I’ve been using If This Then That (best part? it’s free!) for several months, and wanted to share a simple way to post updates from Google+ (or any RSS feed, but I digress) to your other social media services.

Currently I only use Google+, LinkedIn, Twitter, and Facebook – though I am sure this basic process will work for any other ifttt-supported social media service which you can access in a write form (they call them channels).

There’s a way to use email to post updates to Facebook and Twitter, but I kinda like the ifttt method more – it’s more intuitive to me.

Here’s the basic method (or you can use the recipe I shared that does this):

  1. login/authorize GPlusRSS with your Google account
  2. copy the RSS feed GPlusRSS gives you of your G+ public posts
  3. login to ifttt and enable/authorize (if you haven’t previously) the channel for the social media service you want to post to (I’m using Twitter for this example)
  4. create a new recipe
    1. click “this”
    2. click “Feed” (at this point you can post everything or you can post some things – I’m going to go the some route here)
    3. click “New feed item matches”
    4. in “Keyword or simple phrase”, enter something unique-ish (I use “#twt”)
    5. in “Feed URL”, paste the URL GPlusRSS gave you at parent step 2
    6. click “Create Trigger”
    7. click “that”
    8. click the social media channel you chose in parent step 3
    9. click “Post a tweet”
    10. click “Create Action”
    11. in “Description”, give it a good name, such as “post G+ updates to Twitter if tagged #twt”
    12. click “Create Recipe”
  5. done

If Google+ ever decides to open their API better, ifttt should be able to have a channel for them.

Until then, the above method works like a champ – I use similar recipes for cross-posting to LinkedIn and Facebook from Google+ along with Twitter.