2024-05-14

CentOS Linux 7: End of Life 2024-06-30

CentOS 7 EOL

If this looks a lot like the CentOS Stream 8 EOL content, well, that is because the two aren't too different. It is just that instead of doing this once every 4 years, we get to do it twice in one year.
 
The last CentOS Linux release, CentOS Linux 7, will reach its end of life on June 30th, 2024. When that happens, the CentOS Infrastructure will follow the standard practice it has used since the early days of CentOS Linux:
  1. Move all content to https://vault.centos.org/
  2. Stop the mirroring software from responding to EL-7 queries.  
  3. Additional software for EL-7 may also be removed from other locations.

The first change usually causes any mirror doing an rsync with delete options to remove the contents from their mirror. The second change will cause the yum commands to break with errors. The vault system is also a single server with few active mirrors of its contents. As such, it is likely to be overloaded and very slow to respond as additional requests are made of it. 

 At the same time, the EPEL software on https://dl.fedoraproject.org/pub/epel/7 will be moved to /pub/archive/epel/7 and the mirrormanager for that will be updated appropriately.

In total, if you are using CentOS Linux 7 in either production or a CI/CD system, you can expect a lot of errors on July 1st or shortly afterwards.

What you must do!

There are several steps you can take to get ahead of the tsunami of problems:
  1. You can convert to Red Hat Enterprise Linux 7 and see about getting an Extended Life Cycle Support contract for the system. This is a stopgap measure that gives you time to move toward a newer release of a Linux operating system.
  2. You can move your system to a replacement Enterprise Linux distribution. AlmaLinux, Rocky Linux, Oracle and Red Hat all offer tools which can transition an EL7 system to a version of the operating system they will support for several years.
  3. If you are not able to move your systems in the next 45 days, you should look at mirroring the CentOS Linux 7 operating system to a more local location and move your update configs to use your mirror. With the large number of systems which could potentially try to use the vault system, I would not expect the vault to be very usable. As you will probably need to reinstall, add software, or run continuous CI/CD in other areas, you should keep a copy of the operating system local to your networks.
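For the third option, the "move your update configs" step can be sketched roughly as below. This is only a sketch under assumptions: it assumes the repo file follows the stock CentOS 7 layout (an active mirrorlist= line and a commented-out #baseurl=http://mirror.centos.org/... line), and http://mirror.example.internal is a hypothetical name for your own mirror. The function name repoint_repo is also made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: repoint a CentOS 7 style .repo file from the public mirror
# network to a mirror you control. The mirror URL is hypothetical and
# must not contain '|' or '&', which are special to the sed expression.

repoint_repo() {
    local repo_file="$1" mirror="$2"
    # Comment out mirrorlist lines, then enable the baseurl lines while
    # swapping the public host for the local mirror.
    sed -i \
        -e 's|^mirrorlist=|#mirrorlist=|' \
        -e "s|^#baseurl=http://mirror.centos.org|baseurl=${mirror}|" \
        "$repo_file"
}

# Example (as root, after taking a backup copy of the file):
# repoint_repo /etc/yum.repos.d/CentOS-Base.repo http://mirror.example.internal
```

The rest of each baseurl path, including the $releasever and $basearch variables, is left untouched, so the local mirror needs to keep the same directory layout as the upstream one.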


2024-05-13

CentOS Stream 8 END OF LIFE : 2024-05-31

CentOS Stream 8 EOL

CentOS Stream 8 will reach its end of life on May 31st, 2024. When that happens, the CentOS Infrastructure will follow the standard practice it has used since the early days of CentOS Linux:
  1. Move all content to https://vault.centos.org/
  2. Stop the mirroring software from responding to EL-8 queries. 

The first change usually causes any mirror doing an rsync with delete options to remove the contents from their mirror. The second change will cause the dnf or yum commands to break with errors. The vault system is also a single server with few active mirrors of its contents. As such, it is likely to be overloaded and very slow to respond as additional requests are made of it.

In total, if you are using CentOS Stream 8 in either production or a CI/CD system, you can expect a lot of errors on June 1st or shortly afterwards.

What you can do!

There are several steps you can take to get ahead of a possible tsunami of problems:
  1. You can look at moving to a newer release of CentOS Stream before the end of the month. This usually will require deployment of new images or installs versus straight updates.
  2. You can see if any of the 'move my Enterprise Linux' tools have added support for moving from CentOS Stream 8 to their EL 8.10. For releases before 8.10, this was very hard because CentOS Stream 8 was usually ahead of the current EL release, but with 8.10 Alma, Oracle, Red Hat Enterprise Linux, and Rocky are at the same revisions or newer.
  3. You can start mirroring the CentOS Stream 8 content into your infrastructure and point any CI/CD or other systems to that mirror which will allow you to continue to function.

Mirroring CentOS Stream

Of the three options, I recommend the third. However in working out what is needed to mirror CentOS Stream, I realized I needed newer documentation and it would probably be a long post in itself. For a shorter version for self-starters, I recommend the documentation on the CentOS wiki, https://wiki.centos.org/HowTos(2f)CreateLocalMirror.html . While the information was written for CentOS Linux 6, which reached end of life in 2020, it covers most of the instructions needed. The part which may need updating is the amount of disk space required for CentOS Stream, which seems to be about 280 GB for everything and maybe around 120 GB for any one architecture.
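Before starting a mirror run it may be worth checking that the target filesystem actually has that much room. A minimal sketch, assuming a POSIX df and treating the sizes above as GiB; the function name has_free_gb and the /srv/mirror path are only illustrations:

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the filesystem holding DIR has at least
# NEED_GB gibibytes available.

has_free_gb() {
    local dir="$1" need_gb="$2" avail_kb
    # df -P keeps each filesystem on one line; column 4 is available KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

# Example: refuse to start unless the (hypothetical) target has room
# for the full 280 GB estimate.
# has_free_gb /srv/mirror 280 || echo "not enough space for a full mirror"
```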


Recent EPEL dnf problems with some EL8 systems

Last week there were several reports about various systems having problems with EPEL-8 repositories. The problems started shortly after various systems in Fedora had been updated from Fedora Linux 38 to Fedora 40, but the problems were not happening to all EL-8 systems. The problems were root-caused to the following:

  1. Fedora 40 had createrepo_c-1.0 installed which defaults to using zstd compression for various repositories.
  2. The EL8 systems which were working had some version of libsolv-0.7.20 installed which links against libzstd and so works.
  3. The EL8 systems which did not work either had versions of libsolv before 0.7.20 (that is, any EL-8 release before 8.4) OR they had versions of libsolv-0.7.22.

The newer version of libsolv was traced to repositories of either Red Hat Satellite or a related tool, pulp, and was a rebuild of the EL9 package. However it wasn't a complete rebuild: the EL9 version has libzstd support, but the EL8 rebuild did not.

A band-aid fix was made by Fedora Release Engineering to have EPEL repositories use the xz compression method for various files in /repodata/ instead of zstd. This is a slower compression method, but it works with all of the versions of libsolv reported in the various bugs and issue trackers.

This is a band-aid fix because users will run into this with any other repositories using the newer createrepo. A fuller fix will require affected systems to do one of the following:

  1. If the system is either Red Hat Enterprise Linux, Rocky Linux, Alma Linux, or Oracle Enterprise Linux and not running EL-8.9 or later, it needs to be upgraded to that release OR pointed to an older version of a repository in https://dl.fedoraproject.org/pub/archive/epel/
  2. If the system has versions of libsolv-0.7.22 without libzstd support, they should contact the maintainers of the repository it came from to see if a rebuild with libzstd support can be made.
  3. If the system is CentOS Stream 8, then you should be making plans to upgrade to a different operating system as that version will be EOL on May 31st 2024.
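To see which of these cases a given machine falls into, something like the following sketch could help. It checks both conditions from the root cause list: the libsolv version and whether the library actually links against libzstd. The library path /usr/lib64/libsolv.so.1 is an assumption about where EL8 installs it, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the local libsolv can probably read
# zstd-compressed repository metadata.

version_ge() {
    # True when $1 >= $2, using GNU sort's version ordering.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

links_zstd() {
    # True when the given shared library lists libzstd as a dependency.
    ldd "$1" 2>/dev/null | grep -q 'libzstd'
}

check_libsolv() {
    # The default library path below is an assumption; pass another if needed.
    local ver lib="${1:-/usr/lib64/libsolv.so.1}"
    ver=$(rpm -q --qf '%{VERSION}' libsolv) || return 2
    if version_ge "$ver" 0.7.20 && links_zstd "$lib"; then
        echo "libsolv $ver: zstd metadata should be readable"
    else
        echo "libsolv $ver: zstd metadata is likely unreadable"
        return 1
    fi
}

# Example: check_libsolv
```

The version test alone is not enough, since the 0.7.22 rebuild described above is newer than 0.7.20 but lacks libzstd support; that is why the linkage check is included.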


2023-04-21

Note To Future Self: Lenovo Laptop USB-C Mini Dock Reset

Hello Future Self,

Past Self here leaving you a note since I forgot to do so last time.

The Problem

When running Linux on a Lenovo, there are times where a firmware update will cause problems with the USB-C Mini Dock afterwards. In the previous 2 cases, the dock's RTL network adapter no longer showed up as a device. External monitors plugged into the dock may also not function correctly, but it only happened once so I am not sure about that.

Diagnosis of the problem is that the system will complain of no internet connection, and commands will show something like the following (output altered):


ssmoogen@ssmoogen-rh:~$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether ab:cd:12:34:56:78 brd ff:ff:ff:ff:ff:ff
3: wlp82s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 9a:bc:de:f1:23:45 brd ff:ff:ff:ff:ff:ff permaddr aa:aa:bb:bb:cc:cc

The part that has confused me at least once before is that the enp0s31f6 says NO-CARRIER which made me believe that the switch had problems. Looking at the USB-C showed that the link light was on and there was traffic. This led me to try various solutions which were wrong.

Attempted Solutions

In trying to diagnose this in the past, I tried backing out all the firmware updates to see if they would untrigger the bricking of the connection. The fwupdtool worked great to do this, and I was able to back down through 8 firmwares without a hitch. However the network still said it was offline.

Next I went through older kernels and tried booting with a USB stick. All of them continued to show the e1000e as disconnected. Finally I went through the journalctl output to look at previous boots and which network devices had shown up.


ssmoogen@ssmoogen-rh:~$ journalctl | grep eth0 | tail -n 100
....
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
Apr 20 16:54:38 ssmoogen-rh.localdomain kernel: r8152 4-2.1.2:1.0 eth0: v1.12.13
Apr 20 16:54:38 ssmoogen-rh.localdomain kernel: r8152 4-2.1.2:1.0 enp9s0u2u1u2: renamed from eth0
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) aa:aa:aa:bb:bb:bb
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) aa:aa:aa:bb:bb:bb
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
...

The Solution

Matching up the timestamps I see that there was an additional device seen before the updates occurred. This was using the r8152 driver, but after the firmware updates it no longer showed up. This finally triggered a memory of an email I had seen where someone else had a similar problem. Going through my email archives, I found that the solution they had found was to unplug the USB-C Dock for 1 minute and then plug everything back in. Sure enough, doing this restored the RTL driver and my network was restored. The e1000e was a red herring as it is somewhere internal to the laptop and probably available through a dongle which I forgot about, as there doesn't seem to be an RJ-45 jack I could find on the exterior of the laptop.

Anyway, when this happens again, please remember this letter and save yourself 2 hours of firmware resets and kernel reboots. Sometimes completely turning off the hardware (remove all power from the Dock including the laptop) and turning it back on WILL fix the problem.
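To shortcut the diagnosis next time, a quick check of the kernel log for the r8152 driver may be enough. This is a small sketch that assumes the dock's adapter still uses r8152 and that the log lines look like the journalctl output above; the function name dock_nic_seen is made up.

```shell
#!/usr/bin/env bash
# Sketch: read kernel log lines on stdin and succeed if the r8152
# driver (the dock's Realtek adapter) registered a device.

dock_nic_seen() {
    grep -Eq 'kernel: r8152 [^ ]+ '
}

# Example: check the current boot, and suggest the power cycle if the
# dock's adapter never appeared.
# journalctl -k -b 0 | dock_nic_seen || echo "r8152 missing: unplug the dock for a minute"
```

Comparing against an earlier boot (for example journalctl -k -b -1) shows whether the device was present before the firmware update.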

2023-04-19

~1 year to end of EPEL-7

Extra Packages for Enterprise Linux 7

Extra Packages for Enterprise Linux (EPEL) are packages based off of various Fedora releases but built for the various distributions based off of Red Hat Enterprise Linux. In June of 2014, Red Hat Enterprise Linux 7 (EL-7) was released and over the next several months, a focus was made to make the release of EPEL possible. Much of the work was done by Kevin Fenzi and Dennis Gilmore with some additional work by anyone else who had spare time. The initial goal was to build the core packages Fedora Infrastructure needed to move its core servers to EL-7. That was what had been done for the initial releases of EPEL-5 and EPEL-6, and it would provide enough base 'packages' for other maintainers to add additional packages.

In comparison to trying to get EPEL-5 working, building for EPEL-7 was fairly easy. The initial distribution came with a large set of shipped development libraries and tooling versus getting added later. Over the years, the EL-7 distribution also gained various newer gcc toolkits via software collections which also helped EPEL maintainers to keep updating software for much of the 10 year lifecycle of EL-7. However, this maintenance has been getting harder over the last 2 years, as more and more software required either newer kernels, glibc, or other libraries that aren't available for an older operating system. [This is similar to what happened with EPEL-5 and EPEL-6 where the last year or so of the repository was more and more packages being removed due to maintenance concerns.]

This ties in with the general end of support for Red Hat Enterprise Linux 7 on June 30, 2024. While final plans for how EPEL-7 will be end-of-lifed have not been made, this is a general outline based on how EPEL-5 and EPEL-6 were ended.

  1. There will be regular reminders on mailing lists that the project will no longer be supporting EL-7 after a specified date. 
  2. On that date, the following will happen:
    1. The Fedora build system will not allow any more EPEL-7 builds
    2. A final push of all updates will happen to /pub/epel/7/
    3. The current items in /pub/epel/7/ will be archived over to /pub/archive/epel/7.final/
    4. Symbolic links will be made to point /pub/archive/epel/7 to the 7.final
    5. The mirrormanager program, which is what yum uses to look for updates, will be changed to point to /pub/archive/epel/7/
    6. After a week to allow mirrors to catch up, /pub/epel/7/ will be removed and replaced with a note telling people where to find the archived content.
  3. Updates to lists and such explaining what happened will occur.

Why A Year Plus Notice?

EPEL-7 is the largest release that the Fedora project supports. There are about 400,000 Fedora systems seen by countme, and somewhere between 3.4 million and 6.7 million EPEL-7 users (depending on how one looks at mirrormanager statistics). Going from the long tail turn off of EPEL-5 and EPEL-6 systems over the years, many of those EPEL-7 systems will take years to move to later releases. Going from past reports, many of the system administrators are not the original admins who set up the machine, and don't even know the OS or its auxiliary repositories like EPEL are no longer updated. Putting up blog posts like this can help:

  • Give admins notice and a case to their management to do updates BEFORE the end of life date.
  • A heads up on why scripts that mirrored content from /pub/epel/7/ will no longer work.
  • Time to mirror the content locally for the inevitable reinstalls because management doesn't think an update to a newer release is needed.

Whatever the case, good luck to you fellow system administrators.

2022-04-01

Compiling openldap for CentOS 8 Stream

Compiling OpenLDAP for EL8 systems

Steps to compile openldap-server for CentOS 8 Stream

The EL8 release did not ship an openldap-server like previous releases did. Only the client tools and some libraries are included for existing applications, as the focus from the upstream provider has been on other LDAP solutions.

This leaves a problem for various sites who have their data in an OpenLDAP system and do not have the time, energy, or resources for moving to something else. There are several possible solutions to this:

  1. Continue to use EL5/EL6 even though it is at end of open maintenance.
  2. Continue to use EL7 until it is end of open maintenance around 2024-06-30.
  3. Move to a different distribution which does have working openldap
  4. Compile replacement tools using the Fedora src.rpm which may be closer to the ‘upstream’.
  5. Compile replacement tools using the upstream source.
  6. Compile using the upstream source from https://git.centos.org
  7. [Added after initial post] You can download them from https://koji.mbox.centos.org/koji/

In this tutorial we will work with number 6. At the end we will cover number 7.

Setting up a build environment.

For simplicity's sake, we will assume you have a working but minimally installed Fedora 35 or EL8 system (Alma, Oracle, Rocky, etc) which you can do compiles in. If we are using an EL8 system, we are going to need the EPEL repository enabled to get mock installed.

$ sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

For Fedora and EL8 systems the following should work the same:

$ sudo dnf install git mock rpm-build
$ sudo usermod -a -G mock $USER
$ newgrp mock

Answer yes to the questions about adding new keys and the packages should be installed to allow for a build to occur. We now need to set up a minimal .rpmmacros file for the next steps:

# uncomment if you want to build in standard homedirectory
#%_topdir %(echo $HOME)/rpmbuild
# comment if want to use standard home directory
%_topdir        %{getenv:PWD}
%_sourcedir     %{_topdir}/SOURCES
#%_sourcedir     %{_topdir}/SOURCES/%{name}-%{version}
%_specdir       %{_topdir}/SPECS
%_srcrpmdir     %{_topdir}/SRPMS
%_builddir      %{_topdir}/BUILD

%__arch_install_post \
    [ "%{buildarch}" = "noarch" ] || QA_CHECK_RPATHS=1 ; \
    case "${QA_CHECK_RPATHS:-}" in [1yY]*) /usr/lib/rpm/check-rpaths ;; esac \
    /usr/lib/rpm/check-buildroot

Once we have that in place, the following will get an openldap build going:

$ mkdir -vp ~/EL8-sources/ ~/output-packages/
$ cd ~/EL8-sources/
$ git clone https://git.centos.org/rpms/openldap.git
$ git clone https://git.centos.org/centos-git-common.git
$ cd openldap
$ ../centos-git-common/get_sources.sh
$ rpmbuild -bs SPECS/openldap.spec

Now depending on the host OS you are doing this on, you should see a file like SRPMS/openldap-2.4.46-18.fc35.src.rpm or SRPMS/openldap-2.4.46-18.el8.src.rpm having been created.

$ mock -r centos-stream+epel-next-8-x86_64 --chain --localrepo \
~/output-packages/ SRPMS/openldap-2.4.46-18.fc35.src.rpm

should then attempt to build the packages and will end up with a fully usable repo in ${HOME}/output-packages/results/centos-stream+epel-next-8-x86_64

If not, then there are probably some steps or problems I missed in this howto :(. At this point you can determine what to do with installing this -server package on the server needing it.
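Since the name of the source rpm depends on the host (fc35 versus el8), the mock command may be easier to reuse if the file name is looked up rather than typed. A small sketch, with find_srpm as a made-up helper name:

```shell
#!/usr/bin/env bash
# Sketch: print the first source rpm in DIR whose file name starts
# with NAME-, whatever dist tag (fc35, el8, ...) the host produced.

find_srpm() {
    local dir="$1" name="$2" f
    for f in "$dir"/"$name"-*.src.rpm; do
        # An unmatched glob stays literal, so confirm the file exists.
        [ -e "$f" ] && { printf '%s\n' "$f"; return 0; }
    done
    return 1
}

# Example, run from ~/EL8-sources/openldap:
# mock -r centos-stream+epel-next-8-x86_64 --chain --localrepo \
#   ~/output-packages/ "$(find_srpm SRPMS openldap)"
```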

Downloading direct from CentOS.

This is the ‘feed the fisherman versus teaching how to fish’ part of the document.

If you are using CentOS Stream 8, you can download the built packages from the project koji. I expect similar steps can be done for other rebuilds.

  1. dnf list openldap to get which package you are looking for.
  2. Open a window to https://koji.mbox.centos.org/koji/
  3. Type in openldap in the Search box.
  4. Click on the build you would have installed. For this example, we will choose https://koji.mbox.centos.org/koji/buildinfo?buildID=18688 and then scroll down to the architecture you are using.
  5. Right click on the download button for openldap-servers, for example: https://koji.mbox.centos.org/pkgs/packages/openldap/2.4.46/18.el8/x86_64/openldap-servers-2.4.46-18.el8.x86_64.rpm
  6. Install this package wherever you need it.
  7. When dnf breaks because it can’t upgrade the package due to the upstream updating, go back to step 1 and repeat.

2022-02-28

Dealing with RAID arrays

Dear Future Self,

 We have come to another letter where we are going to better document something PastSelf thought it knew, but clearly didn't. In this case we are going to start recovering from a RAID array after a reinstall. For reasons we won't get into, PastSelf had to reinstall the home server for the 2nd time this week. [Let us just say that PastSelf is no longer allowed to use sudo without supervision and move on.] In the reinstall, we could not get the /dev/sdb and /dev/sdc RAID array to be fully recognized and realized that we had also made the original ones too small for what we needed [which is what started the whole problem when we tried to grow a partition but forgot that the external backup always becomes /dev/sda for some reason and /dev/sdb was not the RAID drive but the / drive. Live and learn, live and learn.]

Due to some bad signatures we needed to clear the drives of their current data. This was done by booting from a USB stick (which also becomes /dev/sda in this hardware.... wtf?) and clearing each drive of its signatures. 

# wipefs -a /dev/sdb
# wipefs -a /dev/sdc
# wipefs -a /dev/sdd
# cat /proc/mdstat 
Personalities : 
md127 : inactive sdc1[1](S)
      1464851456 blocks super 1.2
       
unused devices: <none>

  

The above failed because the kernel had tried at boot to make them part of a RAID array /dev/md127 but was not able to sync them. I was also unable to run

# mdadm --stop /dev/md127

for some reason. At this point, PastSelf further broke his oath of primum non nocere by using dd on each of the disks.

# dd if=/dev/zero of=/dev/sdb bs=1024 count=1000000
# dd if=/dev/zero of=/dev/sdc bs=1024 count=1000000
# dd if=/dev/zero of=/dev/sdd bs=1024 count=1000000

A reboot and going into rescue mode still showed that some signatures were there, which I realized was due to these disks being formatted with GPT and being much more capable of surviving stupidity: GPT keeps a backup copy of its header at the end of the disk, which the dd commands above never touch. However mdadm --stop now worked so I could use gdisk on the drives. I then reinstalled a minimal Alma 8.5 onto the box and then did a manual creation of the RAID array:

# gdisk /dev/sdc
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries.

Command (? for help): n
Partition number (1-128, default 1): 1
First sector (34-3907029134, default = 2048) or {+-}size{KMGTP}: 
Last sector (2048-3907029134, default = 3907029134) or {+-}size{KMGTP}: 
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): fd00
Changed type of partition to 'Linux RAID'

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): y

  

At this point we were able to get the system ready for creating the RAID array.

# mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1 --force
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: /dev/sdc1 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Thu Dec 30 18:54:28 2021
mdadm: size set to 1953381440K
mdadm: automatically enabling write-intent bitmap on large array
Continue creating array? y
mdadm: Fail to create md1 when using /sys/module/md_mod/parameters/new_array, fallback to creation via node
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
[root@xenadu ~]# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active raid1 sdc1[1] sdb1[0]
      1953381440 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  0.7% (15491456/1953381440) finish=158.1min speed=204199K/sec
      bitmap: 15/15 pages [60KB], 65536KB chunk

unused devices: <none>
# mdadm --detail --scan
ARRAY /dev/md1 metadata=1.2 name=xenadu.int.smoogespace.com:1 UUID=c032f979:e8e4deda:a590ca5d:820a8548
# mdadm --detail --scan > /etc/mdadm.conf
# echo '/dev/md1 /srv xfs defaults 0 0' >> /etc/fstab

Now wait for the sync to be done, and then start the restore from backups... you know the ones that Past-PastSelf made just in case of this situation. Also Future-Self, could you please write up some ansible playbooks to do this from now on? Future-FutureSelf will appreciate it.
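Until those playbooks exist, the "wait for the sync" step could be scripted with something like the sketch below. It only looks for the resync, recovery, or reshape progress lines that /proc/mdstat shows while an array is busy, as in the output above; the function name md_busy is made up.

```shell
#!/usr/bin/env bash
# Sketch: read /proc/mdstat style text on stdin and succeed while any
# array still reports resync, recovery, or reshape activity.

md_busy() {
    grep -Eq '(resync|recovery|reshape) *='
}

# Example: poll once a minute until the array has finished syncing,
# then start the restore.
# while md_busy < /proc/mdstat; do sleep 60; done
```

mdadm also has a --wait option which blocks until resync activity on the given device has finished (mdadm --wait /dev/md1), which may be simpler when only one array is involved.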

Yours Truly, PastSelf