Here is my recommendation for quickly learning at least the basics of infrastructure:
Hardware:
- If you can, start with a cluster of at least 3 computers.
- If not, a single computer running a hypervisor will work.
- If not, then VirtualBox or VMware Workstation will get you started with the absolute basics until you can get a hypervisor.
- If you have a couple of old computers or laptops (at least 8GB RAM each) then start with those.
- I recommend grabbing a few SFF Dell or Lenovo machines or Intel NUCs from eBay. If you can, put a 1TB NVMe drive in each one and as much RAM as possible.
- If you want to learn phones, get a cheap VOIP phone that will work with Asterisk (I will explain this later).
- If you are interested in Enterprise equipment pick up a few old Servers and a Server Rack (this WILL be expensive)
- Use Labgopher to find good deals on used servers (https://www.labgopher.com/)
- If you want to focus on networking grab a used Firewall (Fortigate is one of the cheapest, Palo Alto is really fucking good, Sophos is pretty cheap, but I personally would avoid Cisco Firewalls) and a fancy Layer 3 Switch (Cisco, Juniper, Netgear, Ubiquiti are all popular brands)
- Also don’t forget to get a USB-to-RJ45 serial (console) cable for managing the switches and firewalls.
Connect them all to your lab network and install your hypervisor of choice on each of them. If you can get 4 computers, I would make one of them a dedicated NAS with something like TrueNAS or Unraid. The benefit of a NAS is that you can store your VMs on it, and because that storage is shared by all nodes in the cluster, you can do things like live migration between hypervisor nodes: only the VM’s running state has to move, not its disks. I’m not sure how important that is to learn or play around with, but it is super cool to see VMs pop between nodes in an instant. It also allows you to use High Availability (HA).
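To put rough numbers on why shared storage makes migration feel instant, here is a back-of-the-envelope sketch in Python. The VM sizes and link speed are made-up example values, and the math ignores protocol overhead and memory pages that change during the copy, so treat it as an illustration rather than a benchmark.

```python
def transfer_seconds(gib: float, link_gbps: float) -> float:
    """Rough time to push `gib` GiB over a link of `link_gbps` gigabits per second."""
    bits = gib * 1024**3 * 8
    return bits / (link_gbps * 1e9)


def migration_seconds(ram_gib: float, disk_gib: float,
                      link_gbps: float, shared_storage: bool) -> float:
    """Estimate migration time for one VM.

    Assumption: with shared storage only the RAM has to be copied,
    otherwise the disk has to be copied as well.
    """
    payload = ram_gib if shared_storage else ram_gib + disk_gib
    return transfer_seconds(payload, link_gbps)


if __name__ == "__main__":
    # Hypothetical VM: 8 GiB RAM, 100 GiB disk, 1 Gbps lab network.
    for shared in (True, False):
        secs = migration_seconds(8, 100, 1.0, shared_storage=shared)
        print(f"shared_storage={shared}: ~{secs / 60:.1f} minutes")
```

On a 1 Gbps link the difference is roughly one minute versus a quarter of an hour for this example VM, which is why the NAS is worth having if you want to play with migration and HA.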
Hypervisor recommendations:
Proxmox:
- The best homelab hypervisor.
- Free, runs on anything, lots of features, good documentation.
- Fairly simple to get running, making VMs or LXC containers is trivial.
- Optimization is a little difficult to learn. There are a lot of settings relating to CPU and disk that can make the VM run faster or slower depending on your setup. The Proxmox docs have pretty good descriptions of the settings and what they do at a high level.
- Proxmox does not have full feature parity with VMware, so there are tools that exist for ESXi and vSphere that just don’t for Proxmox.
- Proxmox does not have a good way to administer multiple clusters. Probably not a problem for you but any company that will have multiple clusters will probably look for something else.
- There is a third party tool for managing multiple clusters that looks promising: https://corsinvest.it/cv4pve-admin-proxmox/
- Will run on anything. Actually anything. I have gotten it running on a bunch of Chromeboxes with only 2GB of RAM each.
Nutanix:
- I have not used it. I hear that it works alright as long as you use their hardware.
- I think more medium to large companies will be moving to Nutanix over Proxmox, but I have no experience to back that up, just a hunch.
- Free homelab version
- Has specific hardware requirements to run it.
XCP-NG:
- I really don’t know a lot about it other than it is what very large companies appear to be using.
- Apparently it scales really well
- Free for homelab use
- Has specific hardware requirements to run it.
VMware/ESXi:
- I refuse to recommend this garbage to anyone especially for homelab use, but there are still too many companies running on it.
- Doesn’t work on a lot of consumer hardware. There are workarounds, but it is a pain in the ass to get it running on something like an Intel NUC.
- Industry standard, a lot of places are moving away from it, but apparently it is still the best option for certain small businesses and places where it is too ingrained to just make the switch to something else.
- They have some features that the other options can’t match. I think vMotion is a big one for some places.
- Apparently it has the best integration between the hypervisor and the VM for Windows VMs. I can’t personally confirm that, other than that in my experience old versions of Windows (7 and earlier) do not run well on Proxmox.
- Has specific hardware requirements to run it.
- No free options anymore, and good luck convincing them to let you buy it as an individual.
- There are ISO files out there where you can install ESXi without a license.
OpenStack:
- Once again I don’t have much experience with this, but if you want to learn cloud this is like running AWS on your own hardware.
- Different from traditional hypervisors. OpenStack acts more like a personal cloud compared to a hypervisor.
- Should run on almost anything
- Difficult to set up and get running
- In my opinion it is much harder to learn than traditional hypervisors.
Learning:
So I’m by no means a professional when it comes to this, but this is my advice for learning things as quickly as possible.
Build a network:
- At least 2 Windows Server VMs running as DCs
- Get AD running. You can use a tool called BadBlood to automatically add a bunch of users and groups to AD for testing/practice (https://github.com/davidprowe/BadBlood)
- You may want to set up AD sync or a separate Entra (Azure AD) account and either sync them or practice using them separately. Entra is AD for the cloud, and it does a few things differently. Entra is free for basically anything you will want to do with it, so I highly recommend trying it.
- Add DNS and DHCP roles to the Windows servers
- Make an SMB share on one of the Windows servers and make it available to any domain-joined machine.
- Take it another step and set the permissions so that only users in an “Accounting share” group can access the share
- Take it even further and set more complex permissions for certain groups to only be able to access certain folders/files.
- Try making nested folders with a file in each with the groups that should be able to access it, then after you set your permissions check to see if you did it right.
- Deploy a firewall. I recommend OPNsense, but pfSense, OpenWrt, VyOS, or IPFire should work as well. You can get really into it and just use a Linux VM (Debian, openSUSE, or some other server distro) with iptables or ufw, or you can go for something simpler like Sophos XG. If you want to experiment, you can get trials for the Palo Alto firewall VM or the FortiGate VM. I like Labhub (https://labhub.eu.org/) for finding the VM disks for different firewalls. You can also use something like GNS3, EVE-NG, or my personal favorite, PNETLab, if you want to focus on just learning networking (firewalls, routers, switches).
- Make the firewall the default gateway, but have Windows handle DNS and DHCP. If you want to learn more switch to something like bind9 on a Linux VM in the future, but personally I find the Windows interface easier for learning the basics.
- I also recommend trying pi-hole for DNS.
- You can set up Unbound as a local recursive resolver so that, instead of forwarding queries to a third-party resolver, it walks the DNS hierarchy itself starting at the root servers. The idea is that Unbound offers better privacy because no single third-party resolver sees every website you look up.
- The other common technologies are bind, dnsmasq, PowerDNS, or something like DoH (DNS over HTTPS).
- Reverse Proxy (Nginx Proxy Manager, Nginx, Traefik, Caddy): The next step after DNS as far as simplifying things goes is using a reverse proxy. A reverse proxy takes a request for a domain and forwards it to the correct server. So if you have a DNS record for a Windows server called win-dc1.MYDOMAIN.COM but the server is hosting a website on port 8080, you would have to go to win-dc1.MYDOMAIN.COM:8080 in order to access the website. With a reverse proxy you can point website.MYDOMAIN.COM to the win-dc1 IP on port 8080. Then you can type website.MYDOMAIN.COM into your browser and the proxy will serve you the site from win-dc1.MYDOMAIN.COM:8080. You can also handle encryption at the proxy, so you can use HTTPS without having to handle certs on each website/application you host.
- If you are confident in your security skills you can open ports 80 and 443 on your home firewall and forward them to the reverse proxy (or to the firewall for your lab, then forward again to the reverse proxy). If you own the domain (you should buy a domain, they are dirt cheap) you can have your external DNS point to your home IP. Then you can access these websites from anywhere on the internet.
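If the reverse proxy part is hard to picture, this is roughly the lookup it does for every request, sketched in Python. It is not how Nginx or Caddy are actually implemented, just the idea: read the Host header and pick a backend. The routing table reuses the website.MYDOMAIN.COM example from above; the `ROUTES` and `pick_backend` names are made up for this sketch.

```python
from typing import Optional, Tuple

# Hypothetical routing table: public hostname -> (backend host, backend port).
ROUTES = {
    "website.mydomain.com": ("win-dc1.mydomain.com", 8080),
}


def pick_backend(host_header: str) -> Optional[Tuple[str, int]]:
    """Return the backend for a request's Host header, or None if unknown.

    Simplification: assumes a plain hostname, not an IPv6 literal.
    """
    # Drop an optional ":port" suffix and normalise case,
    # since hostnames are case-insensitive.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return ROUTES.get(hostname)


if __name__ == "__main__":
    print(pick_backend("website.MYDOMAIN.COM"))
    print(pick_backend("unknown.mydomain.com"))
```

A real proxy then opens a connection to that backend, relays the request, and usually terminates HTTPS itself, which is why your apps behind it don’t each need their own certs.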
Deploying apps:
So now you have a functional network with DNS, DHCP and a firewall sitting between your lab and your home Internet. The next step in my opinion is to start running applications on your network to get an idea of how things interact on a network. For inspiration for things to set up look here:
https://github.com/awesome-selfhosted/awesome-selfhosted
https://github.com/awesome-selfhosted/awesome-selfhosted/blob/master/non-free.md
https://github.com/awesome-foss/awesome-sysadmin
- I know some people will disagree with me here, but I would start with WordPress. WordPress has multiple pieces to it that all have to interact for it to work. So you will have something like nginx or Apache serving the website (with PHP), and something like MySQL or MariaDB running the database. Try to put those things on different machines. So host the nginx side on a new Windows server and host MariaDB on a Linux VM, then connect them. You can put it all on a single machine, but I think splitting it up helps you understand how machines interact on a network.
- Start using your AD:
- SSO with Authentik or Keycloak or Authelia. Connect your AD, with all your users, to a service that provides SSO (Single Sign-On) for any other services you deploy. Your users will only have to remember a single login for everything from logging into a workstation to connecting to custom applications (assuming the application can use OIDC or OAuth2). If you want to go the cloud route, here is where you would use Entra as your SSO. I am not as familiar with it, but it should be included with the free tier.
- Nextcloud. Set up Nextcloud and connect it to your SSO of choice. Create a group called Nextcloud and make sure only those users can access/use Nextcloud. Bonus points if you separate Nextcloud into nginx and a Database (Bonus, bonus if you can get it working on Windows). DO NOT USE THE AIO VERSION
- MediaWiki: Host your own Wikipedia. If you are especially made of money and skills you can download a full archive and host your own version of (a slightly outdated) Wikipedia. Otherwise you can make a clone of a Fandom wiki, or make your own wiki.
- Again bonus points for putting the DB on a different machine.
- I cloned Wookieepedia when I was experimenting with MediaWiki. I was able to find the same layout and plugins that they were using and import 600k entries. All of this was done to test a webscraper that I was working on, so it shouldn’t take more than 2 hours to get a basic version working and less than a week to import all of Wookieepedia.
- Mattermost or RocketChat: Deploy your own Discord/Teams/Slack clone. Make sure encryption is working. Connect it to your SSO.
- Bonus points if you set it up to share a DB with one of your other apps.
- Bonus points if you can get encryption working between your Mattermost and DB/Proxy.
- Extra Bonus if you deploy it with Kubernetes
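Once you start splitting apps across machines (web server on one VM, database on another), the first thing that usually goes wrong is that one box simply can’t reach the other’s port. Here is a minimal Python sketch for checking that from the web server side. The hostname in the usage example is a placeholder, and 3306 is just the default MySQL/MariaDB port, so adjust it if you changed yours.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical database VM name; replace with your own.
    print(can_reach("mariadb.mydomain.com", 3306))
```

If this returns False, check DNS, the firewall rules between the two machines, and whether the database is listening on more than just localhost before you start blaming WordPress or Nextcloud.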
Securing and Monitoring:
- Deploy a monitoring service or a RMM (Remote Management and Monitoring) like Zabbix to monitor for when your VMs or services go offline. Deploy agents to everything on your network.
- Bonus points for configuring push notifications or emails with SMTP
- Bonus points for using encryption
- Bonus points for deploying the agents to all machines with Ansible
- Bonus points for proxying your local network to a Zabbix dashboard running in the cloud
- Bonus points if you use Zabbix: Add MeshCentral for Remote Management since Zabbix only does monitoring
- Leverage automation tools like Ansible to automatically:
- Rotate local Admin creds every 24 hours.
- Deploy SSH keys to all machines
- Mass install/uninstall a program on each computer
- Backups: You need backups. Ideas to get you started:
- For VMs on Proxmox you can use the integrated backup system to automatically schedule daily backups.
- ZFS backups with Sanoid: https://github.com/jimsalterjrs/sanoid
- File/folder sync with syncthing: https://syncthing.net/
- Proxmox backup server (Can backup proxmox nodes as well as VMs): https://www.proxmox.com/en/proxmox-backup-server
- VPN deployment options:
- Simple: Wireguard, OpenVPN
- Intermediate (and paid): Tailscale, ZeroTier, Firezone, etc.
- Intermediate (but free): OpenVPN using OPNSense, Self hosted OpenVPN (without the script), OpenConnect VPN, or SoftEtherVPN.
- Advanced (but free): Self host Netbird.
- More Advanced (but free): Compile and Self host Firezone
- You hate yourself and security: PPTP or L2TP. L2TP with IPsec is better, but honestly there are better options.
- VLANs for everything. Separate your network out into a bunch of different VLANs for things like Management, VOIP, Workstations, etc.
- DMZ: Put anything that is or would be “internet facing” in a DMZ. This will isolate those machines from the rest of the network so that if they get hacked the hackers cannot pivot to the rest of the network. I would put the reverse proxy here then another “WebDMZ” for the webservers. Basically you want to isolate anything that the internet touches. If they require access to internal resources do your best to ONLY allow the specific services you need through to reduce the attack vectors.
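To plan the VLAN and DMZ rules before touching the firewall, it can help to write the policy down as data and test it. This is a small Python sketch of a default-deny allowlist between VLANs. Every subnet, the "servers" VLAN, and the two allowed flows are hypothetical examples I picked for illustration, not a recommended ruleset.

```python
import ipaddress
from typing import Optional

# Hypothetical lab addressing: one /24 per VLAN.
VLANS = {
    "management": ipaddress.ip_network("10.10.10.0/24"),
    "workstations": ipaddress.ip_network("10.10.20.0/24"),
    "voip": ipaddress.ip_network("10.10.30.0/24"),
    "dmz": ipaddress.ip_network("10.10.40.0/24"),
    "webdmz": ipaddress.ip_network("10.10.50.0/24"),
    "servers": ipaddress.ip_network("10.10.60.0/24"),
}

# Allowlist of (source VLAN, destination VLAN, TCP port).
# Anything between VLANs that is not listed here is denied.
ALLOW = {
    ("dmz", "webdmz", 8080),      # reverse proxy -> web servers
    ("webdmz", "servers", 3306),  # web servers -> database
}


def vlan_of(ip: str) -> Optional[str]:
    """Return the name of the VLAN containing `ip`, or None if it is outside the lab."""
    addr = ipaddress.ip_address(ip)
    for name, net in VLANS.items():
        if addr in net:
            return name
    return None


def is_allowed(src_ip: str, dst_ip: str, port: int) -> bool:
    """Default-deny check for traffic that crosses the firewall."""
    src, dst = vlan_of(src_ip), vlan_of(dst_ip)
    if src is None or dst is None:
        return False
    if src == dst:
        return True  # same-VLAN traffic never reaches the firewall
    return (src, dst, port) in ALLOW


if __name__ == "__main__":
    print(is_allowed("10.10.40.5", "10.10.50.7", 8080))  # proxy -> web server
    print(is_allowed("10.10.40.5", "10.10.20.9", 445))   # DMZ -> workstation SMB
```

The point of the exercise is the second call: a compromised box in the DMZ should have no path to your workstations unless you explicitly wrote one down.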
Bonus Knowledge:
- VOIP Phones. Buy yourself a cheap VOIP phone and deploy FreePBX or Wazo. Connect them to a SIP provider and start receiving calls!
- Bonus points for IVR
- Bonus points for multiple phones with extension numbers
- Bonus points for connecting VOIP apps on your desktop
- Bonus points for using a separate VLAN for VOIP with QoS (Quality of Service) rules to prioritize VOIP traffic
- Deploy a documentation server for documenting your Network as you build it. I recommend Bookstack, but there are lots of options out there both paid and free. Personally I think you should self-host it but if you don’t trust yourself yet get something that runs in the cloud.
- Make a CA (certificate authority) or PKI (Public Key Infrastructure) on your network and deploy the certs to all of your webservers and add the CA to any workstations on the network so your users can use HTTPS and trust the CA that they come from.
- This is hard to do, and I have yet to find a good reason not to just use a reverse proxy and Let’s Encrypt for this purpose.
- You can use your CA to sign RDP certs and LDAPS (LDAP over SSL) for better security and to make sure your CA is trusted by all machines on the network.
- The CA will still not be trusted by outside devices like phones or laptops that are not domain joined
- HA and Failover for your network stack:
- Proxmox: Use HA and groups on important VMs like your DCs and Reverse Proxy.
- Group your Proxmox nodes so that the two Windows DCs will never share the same node in case that node goes down.
- Tell Proxmox the state you want the VMs in (Started).
- Set up the Watchdog in the monitoring tab for each VM to shutdown or reset so that if the VM freezes it will shutdown and Proxmox will try to restart it with the HA policy.
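The "DCs should never share a node" rule is easy to get wrong after a few migrations, so here is a small Python sketch that checks a placement against that kind of rule. It does not talk to Proxmox at all: the placement dictionary is something you would fill in yourself, and the node names (pve1, pve2) and the second DC name are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def anti_affinity_violations(
    placement: Dict[str, str],
    groups: Dict[str, List[str]],
) -> List[Tuple[str, str, List[str]]]:
    """Find nodes hosting more than one VM from the same keep-apart group.

    placement: VM name -> node name it currently runs on.
    groups:    group name -> VMs that must not share a node.
    Returns a list of (group, node, [vms on that node]).
    """
    violations = []
    for group, vms in groups.items():
        by_node = defaultdict(list)
        for vm in vms:
            node = placement.get(vm)
            if node is not None:  # skip VMs that are not placed anywhere
                by_node[node].append(vm)
        for node, hosted in sorted(by_node.items()):
            if len(hosted) > 1:
                violations.append((group, node, hosted))
    return violations


if __name__ == "__main__":
    placement = {"win-dc1": "pve1", "win-dc2": "pve1", "proxy": "pve2"}
    groups = {"domain-controllers": ["win-dc1", "win-dc2"]}
    print(anti_affinity_violations(placement, groups))
```

An empty list means the rule holds. Anything else is a node whose failure would take out both DCs at once.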