Red Hat OpenStack ships with Pacemaker built in to manage the status of several Docker containers, and this also affects how MariaDB runs on OpenStack. When MariaDB fails on Red Hat OpenStack, you will usually see something like this:

[[email protected] etc]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller1 (version 1.1.19-8.el7_6.2-c3c624ea3d) - partition with quorum
Last updated: Thu May 7 21:55:43 2020
Last change: Thu May 7 21:51:15 2020 by hacluster via crmd on controller2
12 nodes configured
36 resources configured
Online: [ controller1 controller2 controller3 ]
GuestOnline: [ [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] ]
Full list of resources:
Docker container set: rabbitmq-bundle [10.
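When Pacemaker reports the Galera bundle as failed or stopped, recovery is normally driven through pcs rather than systemd. A minimal sketch, assuming the resource is named galera-bundle as in a default TripleO deployment:

# Clear stale failure records so Pacemaker re-evaluates the resource
pcs resource cleanup galera-bundle

# If the bundle stays down, restart it under Pacemaker control
pcs resource restart galera-bundle

# Confirm the galera containers come back as Masters on all controllers
pcs status | grep -A 3 galera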

Continue reading

OVS traffic capture

OVS traffic flow illustration (Kolla example):

Traffic leaving the cloud via the provider network:
VM –> tap+qbr(linuxbridge)+qvb –> qvo+br-int+int-br-ex –> phy-br-ex+br-ex+br_vlan –> external network

Traffic to a VXLAN tenant network:
VM –> tap+qbr(linuxbridge)+qvb –> qvo+br-int+patch-tun –> patch-int+br-tun+port vxlan# –> remote host VXLAN interface IP

If DVR is not used, all traffic goes from the compute nodes to the neutron nodes and leaves the cloud through the neutron nodes' ports.
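To follow a packet along this path you typically capture on each hop in turn. A minimal sketch, assuming the instance's tap device is tap1234abcd-56 (a made-up name) and that the OVS tools are available on the host (in a Kolla deployment they may need to be run inside the openvswitch_vswitchd container):

# Capture at the instance side of the path (tap / qvb / qvo all accept tcpdump)
tcpdump -eni tap1234abcd-56 icmp

# See which bridges and ports the packet traverses
ovs-vsctl show

# Dump the OpenFlow rules on br-int to check VLAN tagging or drops
ovs-ofctl dump-flows br-int

# On the VXLAN path, capture the encapsulated traffic on the tunnel endpoint
tcpdump -eni eth1 udp port 4789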

Continue reading

How to - Ceph - Identify the server drive bay number of a faulty drive

To identify which drive bay a faulty disk is in:

Method 1 - Using iLO or iDRAC
Log in to the iLO or iDRAC interface.
Check for error messages in the iLO or iDRAC.
If iLO (HP), from the main page go to Information → System Information → Storage → Physical View.
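Before opening iLO or iDRAC it helps to know which block device and serial number belong to the failed OSD, so you can match them against the physical view. A minimal sketch, assuming osd.12 (a hypothetical id) is the failed OSD and it was deployed with ceph-volume:

# Find which host the failed OSD lives on
ceph osd tree | grep down

# On that host, map the OSD id to its backing block device
ceph-volume lvm list | grep -A 15 'osd.12'

# Read the drive serial number to match against the iLO/iDRAC physical view
smartctl -i /dev/sdX | grep -i serial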

Continue reading

How to Replace a Ceph OSD

How to - Ceph - Configure Ceph on a new drive

Source: https://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/

Remove the OSD of the faulty drive
If you are replacing a faulty drive with a new one, you will need to remove the OSD of the faulty drive before creating the new OSD.
*Requirement: the faulty SSD must already have been replaced with a healthy SSD.
Log in to the Ceph node with the faulty drive.
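The removal itself usually follows the standard sequence from the linked guide. A minimal sketch, assuming the faulty drive backs osd.12 (a hypothetical id) and the OSD daemon is managed by systemd:

# Take the OSD out of the data distribution so Ceph starts rebalancing
ceph osd out osd.12

# Stop the OSD daemon on the node that hosts the faulty drive
systemctl stop ceph-osd@12

# Remove it from the CRUSH map, delete its auth key and the OSD entry itself
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm osd.12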

Continue reading

MariaDB in a Galera Cluster: Maintenance and Recovery

Introduction

This document covers how to perform system maintenance on the MariaDB database while it is in active production, and how to recover from power outages or network failures.

Environment

SCM (Scyld Cloud Manager) currently leverages the Kolla OpenStack project, which packages the OpenStack services into Docker containers. The MariaDB database uses Galera to run a database cluster across the three OpenStack controller systems. The cluster provides high availability as well as scalability.
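A quick way to judge cluster health during maintenance is to check the wsrep status variables on each controller; if every node has lost quorum (for example after a full power outage), Kolla provides a recovery playbook. A minimal sketch, assuming the MariaDB container is named mariadb and the kolla-ansible inventory is at /etc/kolla/multinode (both names are assumptions):

# On each controller: the cluster size should equal the number of controllers,
# and the local state should be "Synced"
docker exec mariadb mysql -u root -p -e \
    "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_local_state_comment');"

# If all nodes are down and none will bootstrap, let Kolla pick the most
# advanced node and re-bootstrap the cluster
kolla-ansible -i /etc/kolla/multinode mariadb_recovery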

Continue reading

Openstack Magnum

Magnum is the container cluster orchestration tool for OpenStack; it uses Heat to deploy and monitor clusters. The actual workflow is: Python code loads the cluster request –> injects it into Heat templates –> starts building VMs –> runs conditional actions in shell –> builds all the nodes.

Prerequisites
A few things need to be set up before using Magnum: the node image needs to have the 'os_distro' property set; Fedora Atomic requires os_distro=fedora-atomic and CoreOS needs os_distro=coreos.
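For illustration, setting that property and then pointing a cluster template at the image might look like the sketch below; the image, keypair, network, and flavor names are assumptions, not values from this post:

# Tag the image so Magnum knows which driver to use
openstack image set --property os_distro=fedora-atomic fedora-atomic-27

# Create a Kubernetes cluster template that references the image
openstack coe cluster template create \
    --image fedora-atomic-27 \
    --keypair mykey \
    --external-network public \
    --flavor m1.medium \
    --coe kubernetes \
    k8s-template

# Build a cluster from the template
openstack coe cluster create --cluster-template k8s-template --node-count 2 k8s-cluster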

Continue reading

Openstack Octavia

All configuration and commands in this blog have been verified and tested against the Queens release. Considering that Neutron LBaaS has been replaced by Octavia and marked as deprecated since Queens, I think it's time to write a brief blog about Octavia. Load balancing is key to many application services running on OpenStack, and it's critical for Kubernetes environments since it is the only ingress endpoint for an exposed service. Let's first talk about the issues and weaknesses of the current LBaaS:
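As a point of reference, a basic Octavia load balancer is driven entirely through the openstack loadbalancer CLI; the sketch below is illustrative only, with the subnet and member address made up:

# Create the load balancer on the tenant subnet
openstack loadbalancer create --name lb1 --vip-subnet-id private-subnet

# Add a listener, a pool, and one backend member
openstack loadbalancer listener create --name listener1 --protocol HTTP --protocol-port 80 lb1
openstack loadbalancer pool create --name pool1 --lb-algorithm ROUND_ROBIN --listener listener1 --protocol HTTP
openstack loadbalancer member create --subnet-id private-subnet --address 10.0.0.5 --protocol-port 80 pool1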

Continue reading

Translated from the int32bit blog

Advanced OpenStack features

1. Soft-deleting instances

Normally, when a user deletes an instance it is removed from the hypervisor immediately and cannot be recovered. To guard against accidental deletion, Nova supports a soft delete (also called deferred delete) feature. The deferral period is set with the reclaim_instance_interval option in /etc/nova/nova.conf:

[DEFAULT]
...
reclaim_instance_interval = 120

With this set, an ordinary delete does not remove the instance right away; Nova waits two minutes, and during that interval an administrator can restore the instance at any time. Only after 120 seconds have passed is the instance actually deleted, at which point it can no longer be recovered. To demonstrate, we delete the instance int32bit-test-2:

# nova list
+--------------------------------------+-----------------+--------+------------+-------------+-------------------+
| ID                                   | Name            | Status | Task State | Power State | Networks          |
+--------------------------------------+-----------------+--------+------------+-------------+-------------------+
| 8f082394-ffd2-47db-9837-a8cbd1e011a1 | int32bit-test-1 | ACTIVE | -          | Running     | private=10.0.0.6  |
| 9ef2eea4-77dc-4994-a2d3-a7bc59400d22 | int32bit-test-2 | ACTIVE | -          | Running     | private=10.0.0.13 |
+--------------------------------------+-----------------+--------+------------+-------------+-------------------+
# nova delete 9ef2eea4-77dc-4994-a2d3-a7bc59400d22
Request to delete server 9ef2eea4-77dc-4994-a2d3-a7bc59400d22 has been accepted.
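Within the reclaim window the instance can be brought back, or removed immediately without waiting. A minimal sketch using the same nova CLI as above:

# Restore the soft-deleted instance before the reclaim interval expires
nova restore 9ef2eea4-77dc-4994-a2d3-a7bc59400d22

# Or force an immediate, permanent delete instead of waiting out the interval
nova force-delete 9ef2eea4-77dc-4994-a2d3-a7bc59400d22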

Continue reading

Translated from the int32bit blog

Rarely used but useful OpenStack operations

Glance members

Images are very important data in OpenStack: they hold users' operating system data, so keeping them secure matters. Normally the visibility of an image uploaded to Glance can be set to public or private; a public image is visible to all tenants, while a private image is visible only to the owning tenant and administrators. Newer versions of Glance introduce a new visibility state, shared, which allows an image to be shared with one or more specified tenants. A tenant the image is shared with is called a member; simply adding a tenant to an image's member list lets it access the other tenant's image.

Let's try it in a test environment. First, create an image under the admin tenant:

glance image-create --disk-format raw --container-format bare --name cirror-3.0 --file cirros-3.0.img

The image is not visible under the demo tenant:

$ source openrc_demo
$ glance image-list
+----+------+
| ID | Name |
+----+------+
+----+------+

Add the demo tenant to the image's members:

$ glance member-create ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea fb498fdd27e74750a6b209158437696c
+--------------------------------------+----------------------------------+---------+
| Image ID                             | Member ID                        | Status  |
+--------------------------------------+----------------------------------+---------+
| ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea | fb498fdd27e74750a6b209158437696c | pending |
+--------------------------------------+----------------------------------+---------+

After admin adds demo as a member, demo still has to confirm by updating the member status to accepted, meaning it accepts the shared image:

$ glance member-update ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea fb498fdd27e74750a6b209158437696c accepted
+--------------------------------------+----------------------------------+----------+
| Image ID                             | Member ID                        | Status   |
+--------------------------------------+----------------------------------+----------+
| ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea | fb498fdd27e74750a6b209158437696c | accepted |
+--------------------------------------+----------------------------------+----------+

Now the shared image is visible under the demo tenant:
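To check who an image is currently shared with, the member list can be queried at any time; a small sketch using the same image and member IDs as above:

# List all members of the image and their acceptance status
$ glance member-list --image-id ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea

# A member can also be removed again, revoking the share
$ glance member-delete ec5426f5-ab4d-43a6-a1e1-5a1919aa7bea fb498fdd27e74750a6b209158437696c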

Continue reading

Translated from the int32bit blog: How to read the OpenStack source code

OpenStack basics

OpenStack components

OpenStack is an open-source implementation of an IaaS cloud platform; its commercial counterpart is AWS. In the beginning OpenStack had only two components: Nova, providing the compute service, and Swift, providing object storage. Nova did not just provide compute: it also contained the networking, block storage, image, and bare-metal management services. As the project grew, these functions were split out of Nova into independent projects: nova-volume became the Cinder project for block storage, nova-image became the Glance project for image storage, nova-network was the predecessor of Neutron, and bare-metal management was split out into the Ironic project. Container support was also originally provided by Nova, implemented as one of its drivers, then moved to Heat, and has now become an independent project, Magnum; later Magnum's scope was adjusted to focus on container orchestration, with the plain container service taken over by the Zun project. OpenStack originally had no authentication at all; the Keystone identity service was only added in the Essex release.

The current OpenStack base services are:
Keystone: identity service.
Glance: image service.
Nova: compute service.
Cinder: block storage service.
Neutron: networking service.
Swift: object storage service.

After the Essex release, new services kept appearing on top of these core services: the Horizon dashboard, the Heat orchestration service, the Trove database service, the Manila shared file system service, the Sahara big data service, the Mistral workflow service, and the Magnum container orchestration service mentioned above. Almost all of them depend on the base services. For example, the Sahara big data service calls the Heat template service, and Heat in turn calls Nova to create instances, Glance to fetch images, Cinder to create volumes, Neutron to create networks, and so on.

The latest release at the time of writing is Pike, and the Queens release is already in active development.

OpenStack has more and more services of growing complexity, covering an ever larger technology ecosystem; it looks like a behemoth, and newcomers to such a large distributed system inevitably feel a bit like the blind men examining an elephant. There is no need to despair: OpenStack is very well designed, and although there are many projects and components, almost every service follows the same skeleton. Once you are familiar with the architecture of one project and have read its source code in depth, the other projects come easily. This article uses the Nova project as an example and dissects its source structure step by step; hopefully, reading Cinder afterwards will feel effortless.

Sharpen your tools first

To read source code you first need a good code reading tool. A GUI such as PyCharm works fine, but virtual machines usually have no graphical interface, so vim is the first choice; with a little configuration it supports code navigation and code search, see GitHub - int32bit/dotfiles: A set of vim, zsh, git, and tmux configuration files. As shown in the figure:

All OpenStack projects are developed in Python and are standard Python projects managed with setuptools, which handles installing and distributing the Python modules. The most direct way to find out which services make up a project is to find its entry points (main functions). For any standard setuptools-managed project, all entry points are defined in setup.cfg at the project root; console_scripts lists the entry point of every service component. For example, the console_scripts section of Nova's setup.cfg (Mitaka release) looks like this:

[entry_points]
console_scripts =
    nova-all = nova.cmd.all:main
    nova-api = nova.cmd.api:main
    nova-api-metadata = nova.cmd.api_metadata:main
    nova-api-os-compute = nova.cmd.api_os_compute:main
    nova-cells = nova.cmd.cells:main
    nova-cert = nova.cmd.cert:main
    nova-compute = nova.cmd.compute:main
    nova-conductor = nova.cmd.conductor:main
    nova-console = nova.cmd.console:main
    nova-consoleauth = nova.cmd.consoleauth:main
    nova-dhcpbridge = nova.cmd.dhcpbridge:main
    nova-idmapshift = nova.
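The same information can be pulled straight out of a checked-out tree. A small sketch, assuming the nova repository has been cloned to ~/nova (the path is an assumption):

# Show every console_scripts entry point (service name = module:function)
sed -n '/console_scripts/,/^\[/p' ~/nova/setup.cfg

# Jump straight to the main() that a given service starts from, e.g. nova-compute
less ~/nova/nova/cmd/compute.py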

Continue reading


LuLU

Love coding and new technologies

Cloud Solution Consultant

Canada