Optimizing Citrix NetScaler and services

About:

Revision:
Version 1.001 (11/02/2016)

Reviewers:
Carl Behrent
Dave Brett
Stalhood

About the author:
Marius Sandbu works as a Senior Systems Engineer at Exclusive Networks in Norway, where he focuses on software-defined datacenter, end-user computing and cloud technologies. He is a Microsoft Azure MVP and the author of Implementing NetScaler VPX and Mastering NetScaler VPX.
He can be contacted on Twitter @msandbu or by email at [email protected]. He blogs at http://msandbu.wordpress.com

Information about this eBook:
This short eBook covers some basic configuration of a NetScaler VPX, both in an on-premises environment and in Microsoft Azure. It is aimed at system and Citrix administrators who are already familiar with Citrix NetScaler and wish to tune NetScaler and learn more about its different networking settings.
Any feedback can be directed to my email, [email protected].
Note that the information presented in this eBook is based on NetScaler version 11.0, except the part covering Microsoft Azure deployment, which is still running 10.5.

Contents

About:
Background
Tuning NetScaler in a virtual environment
Licensing
CPU Sizing
Memory Sizing
Firmware upgrades
NIC Teaming and LACP
VLAN tagging
Jumbo Frames
NetScaler deployment in Azure
NetScaler packet flow
TCP Profiles
SSL Profiles
VPX SSL limitations
Mobilestream
Compression
Caching
Front-end optimization
HTTP/2 & SPDY
Tuning for ICA Traffic

Background

After working with NetScaler for a few years, I’ve seen that many administrators have yet to grasp the full feature set that NetScaler actually offers. There are many features, and some simple tuning, that can make a big difference in terms of performance.

NetScaler is a key component in many environments, for instance large e-commerce websites that depend on low response times and high-throughput connections so that their end users are able to buy the products they need.

When deploying in virtual environments, more factors affect how well the NetScaler is going to perform. As an example, when a NetScaler VPX is set up in a simple virtual environment using all the default settings from Citrix:

- It is not using the recommended SSL settings
- It is using a default TCP profile which is not optimized for performance
- The virtual appliance is not scaled according to the license you require
- HTTP profiles are not adjusted accordingly

This eBook gives an overview, and for some features a deeper dive, of what you should take a closer look at when tuning your NetScaler environment. Keep the whole network stack in mind as well: NetScaler has many features that can adjust multiple layers of the networking stack.

[Figure: the network stack (application layer: HTTP, SSL/TLS; transport layer: TCP/UDP; internet layer: IP; network layer: Ethernet) with the settings that apply at each layer, such as HTTP protocol version, SSL ciphers, MSS and MTU]

For instance, we have SSL profiles and HTTP profiles, which can adjust performance on the application layer, and we also have TCP profiles, which can adjust how the TCP connection should behave. We also have settings like Jumbo frames and LACP, which work on the lower layers of the stack.

Tuning NetScaler in a virtual environment

If we are to deploy a NetScaler in a virtual environment, it is crucial that we configure the virtual appliance properly. For instance, it is important that we properly allocate vCPU, since NetScaler uses the CPU to handle all network traffic processing such as SSL encryption, Application Firewall, content switching, compression, and so on. While physical NetScaler appliances perform SSL processing in a dedicated ASIC, NetScaler VPX is not so fortunate and instead performs SSL processing on the main CPU. Compression can also consume significant CPU.

NetScaler VPX supports the following hypervisors: Citrix XenServer 6.2 and 6.5, VMware ESX, Microsoft Hyper-V Server 2012 and 2012 R2, and KVM on Linux (Fedora Core 20, Ubuntu 14.10). NetScaler VPX is also supported on Azure and Amazon AWS. In addition, a NetScaler 1000V virtual appliance runs on Cisco Nexus 1100.

By default, the appliance template downloaded from Citrix is set up with the minimum system requirements of 20 GB of disk, 2 vCPUs, 2 GB of memory and two vNICs.

When NetScaler is running in a virtual environment, it is important to remember the performance numbers that have been validated and are supported by Citrix on the VPX appliance:

- SSL transactions/sec (2K key certificates): up to 750
- SSL throughput, Gbps: up to 1.0
- Compression throughput, Gbps: up to 0.75
- SSL VPN/ICA proxy concurrent users: up to 1500

Note: these numbers have been taken from the VPX datasheet and might change or be adjusted over time. The original datasheet can be found here x-data-sheet.pdf

Licensing

The system throughput is based upon the type of license we purchase. NetScaler VPX is available in five different editions: VPX-10, VPX-200, VPX-1000, VPX-3000 and VPX-8000, where the number after VPX- stands for the maximum licensed throughput in Mbps (non-SSL, HTTP traffic) it can handle.

All NetScaler VPX licenses are available in the same editions as the physical appliances. This includes Standard Edition (Load Balancing), Enterprise Edition (web application acceleration, monitoring), and Platinum Edition (caching). There’s also a NetScaler Gateway VPX appliance that only does NetScaler Gateway (no Load Balancing) but is licensed for 50 Mbps.

Citrix also has a couple of free NetScaler VPX licenses:

- NetScaler VPX Express: includes NetScaler Standard Edition features, is limited to 5 Mbps, and must be renewed annually.
- NetScaler VPX Developer Edition: includes NetScaler Platinum Edition features and is limited to 1 Mbps. This “free” license is available to all current NetScaler customers.

Note: NetScaler VPX license files are allocated to the MAC address of the virtual machine. If the MAC address changes, the license file is invalidated. This behavior is typically seen on Hyper-V, so make sure a static MAC address is allocated to the virtual machine.

The throughput limit is enforced only for traffic inbound to the NetScaler, regardless of whether this is request traffic or response traffic. So, for instance, a VPX-1000 can process both 1 Gbps of inbound traffic and 1 Gbps of outbound traffic at the same time. It is however important to remember that traffic going back to a NetScaler is also considered inbound traffic.

CPU Sizing

It is important that we configure the number of vCPUs based upon the VPX model we have. NetScaler uses something called packet engines, each of which is assigned its own vCPU and does the network processing. In a basic setup with two vCPUs, the first vCPU is used for management tasks and the second vCPU is used to process all network traffic and handle all the different features.

VPX-10 and VPX-200 support only one packet engine vCPU, meaning a total of two vCPUs, while for instance VPX-1000 supports two or three packet engine vCPUs. This allows NetScaler to distribute traffic between the different vCPUs, which gives better performance.
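As a rough illustration of the licensing rule described in this section, the following Python sketch reads the licensed throughput out of an edition name and checks an inbound load against it. The function names, and the way inbound traffic is summed (client requests plus server responses arriving at the NetScaler), are assumptions made for illustration; this is not a Citrix API.

```python
def licensed_mbps(edition: str) -> int:
    """Return the licensed throughput in Mbps encoded in a name such as 'VPX-1000'."""
    prefix, _, number = edition.partition("-")
    if prefix != "VPX" or not number.isdigit():
        raise ValueError(f"unexpected edition name: {edition!r}")
    return int(number)


def inbound_mbps(client_request_mbps: float, server_response_mbps: float) -> float:
    """Traffic arriving at the NetScaler: requests from clients plus responses
    coming back from servers (both are treated as inbound in this sketch)."""
    return client_request_mbps + server_response_mbps


def within_license(edition: str,
                   client_request_mbps: float,
                   server_response_mbps: float) -> bool:
    """True if the inbound load fits the license; outbound traffic is not counted."""
    total = inbound_mbps(client_request_mbps, server_response_mbps)
    return total <= licensed_mbps(edition)
```

Under these assumptions, `within_license("VPX-1000", 100, 800)` holds, while the same kind of load profile would not fit on a VPX-200 once the server responses exceed the remaining headroom.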

The following chart shows the different editions and their support for multiple packet engines (PEs) at each memory size.

          Memory
License   2 GB  4 GB  6 GB  8 GB  10 GB  12 GB
VPX-10    1     1     1     1     1      1
VPX-200   1     1     1     1     1      1
VPX-1000  1     2     3     3     3      3
VPX-3000  1     2     3     3     3      3
VPX-8000  1     2     3     4     5      5

The number of vCPUs = the number of PEs + 1. For example, if you have installed a VPX-3000 license and 6 GB of memory is available, to add three PEs you must allocate four vCPUs. If we are running an older version of Hyper-V, this feature might not be supported, and we should in that case upgrade to the latest version of Hyper-V to get that support.

If you allocate more than the licensed number of vCPUs, the extra vCPUs will be ignored and not used as packet engines.

On NetScaler VPX, all SSL operations are performed on the main CPU. Higher single-core CPU clock speeds will increase SSL throughput. To increase the number of vCPUs available for SSL operations, you’ll need a VPX-1000 or higher license. Note: Physical NetScaler MPX and NetScaler SDX appliances have dedicated SSL ASICs, so they have much higher SSL throughput.

For maximum throughput on NetScaler VPX, use your hypervisor management tools to reserve 100% of the CPU allocated to NetScaler. Also, when doing the initial setup, do not enable all features; only enable the features you need. Enabling all features will impact the performance of the NetScaler appliance.

CPU usage on the management CPU and packet CPUs can be seen on the NetScaler using the CLI command

    stat system
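The packet engine chart and the "vCPUs = PEs + 1" rule in this section can be written down as a small lookup. This is only a sketch: the table values are copied from the chart, while the names `PACKET_ENGINES`, `max_packet_engines` and `vcpus_to_allocate` are hypothetical helpers, not part of any NetScaler tooling.

```python
# Packet engines (PEs) per license and memory size, copied from the chart.
MEMORY_GB = (2, 4, 6, 8, 10, 12)
PACKET_ENGINES = {
    "VPX-10":   (1, 1, 1, 1, 1, 1),
    "VPX-200":  (1, 1, 1, 1, 1, 1),
    "VPX-1000": (1, 2, 3, 3, 3, 3),
    "VPX-3000": (1, 2, 3, 3, 3, 3),
    "VPX-8000": (1, 2, 3, 4, 5, 5),
}


def max_packet_engines(edition: str, memory_gb: int) -> int:
    """Look up the PE count for an edition at one of the memory sizes in the chart.

    Raises KeyError for an unknown edition and ValueError for a memory size
    that is not a column of the chart.
    """
    return PACKET_ENGINES[edition][MEMORY_GB.index(memory_gb)]


def vcpus_to_allocate(edition: str, memory_gb: int) -> int:
    """vCPUs = PEs + 1: one management vCPU plus one vCPU per packet engine."""
    return max_packet_engines(edition, memory_gb) + 1
```

For the example in the text, `vcpus_to_allocate("VPX-3000", 6)` gives four vCPUs: three packet engines plus the management vCPU.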

Other Hypervisor Notes:

- NetScaler VPX does not support hypervisor features like SR-IOV or PCI device pass-through.
- Any hypervisor integration tools on NetScaler VPX should not be updated manually, since Citrix will update them automatically in newer firmware upgrades.
- NetScaler VPX should not be migrated between hypervisor hosts unless absolutely necessary. For example, disable automatic VMware vMotion of NetScaler VPX appliances.

Memory Sizing

NetScaler VPX defaults to 2 GB of RAM, which is sufficient for most packet engine operations. However, the following NetScaler features require more RAM:

- Integrated Caching
- Application Firewall
- Web Interface on NetScaler
- WebFront

Firmware upgrades

Citrix often releases new firmware upgrades, which can be downloaded from their website. It is important to read the release notes carefully before performing the upgrade, to ensure that there are no major changes that might affect your existing NetScaler solution and to check for any particular bug fixes or security vulnerability fixes.

Firmware upgrades can also include upgrades to the virtual guest tools used by the hypervisor, and so might include support for a newer version of the hypervisor.

The following knowledge article has information about the different release versions and dates. You should also pay close attention to the support page, which has knowledge articles and security bulletin information on NetScaler. The page can be found here NetScaler

NIC Teaming and LACP

By default, NetScaler VPX comes with two vNICs. If both vNICs are connected to the same VLAN, then one of them should be disabled. If both are enabled, NetScaler will use both interfaces in a round-robin fashion, meaning that the MAC address of the SNIP will constantly change depending on which interface is being used for that particular packet.

NetScaler VPX supports up to 10 vNICs, depending on hypervisor capabilities.

If you need Link Aggregation or NIC teaming, it is best to do that at the hypervisor level. We need to understand how the various types of NIC teaming perform traffic processing. Most vendors have good documentation on their NIC teaming features. For example, Microsoft has documented all the different options ls.aspx?id 30160.

For instance, Microsoft Hyper-V has a form of NIC teaming called switch independent mode, which allows us to connect a physical host to different switches and does not require any configuration on the switches. In this type of NIC teaming, the NetScaler VPX vNIC is bound to only one uplink and its throughput is limited to that one uplink. For Hyper-V, it is important to remember to choose the Hyper-V Port distribution algorithm.

To go beyond a single uplink, you need a packet hashing algorithm, and the switch must be configured to support the hashing. This is typically configured using LACP, which allows for aggregation of bandwidth (incoming/outgoing) and redundancy of NICs. By setting up MLAG between switches, we can also ensure switch redundancy.

Note: If deploying NetScaler VPX in a Hyper-V environment, make sure that the host NIC drivers are running the latest version. There have been known issues with VMQ (Virtual Machine Queue) on Broadcom chips which badly affect the performance of virtual machines.

VLAN tagging

For VLAN tagging, you can either do it at the hypervisor level or inside the NetScaler VPX. At the hypervisor level, you create a VM network (e.g. a VMware Port Group) and assign the VLAN tag to that VM network. The hypervisor adds the VLAN tag to all packets egressing from the NetScaler VPX, and there’s no need to configure any VLAN tagging inside the NetScaler VPX. However, this limits the NetScaler to only one VLAN per vNIC.

Alternatively, you can configure the hypervisor VM network to not touch VLAN tags and let the virtual machine and physical switch handle it. This is sometimes called VLAN trunking. In this case, the NetScaler VPX is responsible for adding all necessary VLAN tags.

If the NetScaler VPX is connected to multiple VLANs, make sure you configure VLANs inside the NetScaler whether the NetScaler is tagging them or not. NetScaler needs the VLAN configuration so it knows which subnets go on which network interfaces. Typically, you create a SNIP for each VLAN, create a VLAN object for each VLAN, and bind the VLAN object to the SNIP and network interface. The VLAN object can be either tagged or untagged. If the hypervisor is doing the tagging, then leave the VLAN object as untagged.

If the hypervisor VM network is in VLAN trunking mode, then the NetScaler VLAN objects probably need VLAN tagging enabled.

Note: VLAN tagging inside the virtual appliance is only supported on ESX and XenServer.

Jumbo Frames

NetScaler VPX supports the use of Jumbo frames, which allows the appliance to process Ethernet frames with up to 9000 bytes of payload and so transfer large files more efficiently than is possible with the standard MTU size of 1500 bytes. Jumbo frames have the potential to reduce overhead and CPU cycles on the appliance.

Jumbo frames require that the infrastructure the appliance is connected to has been configured to support Jumbo frames as well. Internet connections do not support Jumbo frames, since most routers only support the standard MTU, so Jumbo frames are restricted to run only within the datacenter.

Note: Jumbo frames are only supported in NetScaler when it is running on ESX or KVM, and they require host NIC configuration so that the host NIC MTU is aligned with the appliance.

The appliance can operate with jumbo frames in the following scenarios:

- Jumbo to Jumbo. The appliance receives data as jumbo frames and sends it as jumbo frames.
- Non-Jumbo to Jumbo. The appliance receives data as regular frames and sends it as jumbo frames.
- Jumbo to Non-Jumbo. The appliance receives data as jumbo frames and sends it as regular frames.

Jumbo frames (Maximum Transmission Unit size) are configured at the network interface and in TCP profiles. To change the MTU for an interface, go to System > Network > Interfaces, edit an interface and set the MTU.

For TCP-based traffic, the default TCP profile will override the interface value we define for Jumbo frames. The default profile nstcp_default_profile is bound to all TCP-based load balancing services, so we should change the MSS value within the TCP profile if we want to leverage Jumbo frames. You can change the default TCP settings at System > Settings > Change TCP Parameters.
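Since the MSS in the TCP profile has to be raised alongside the interface MTU, it can help to see how the two values relate. The following is a minimal sketch, assuming IPv4 and TCP headers without options (20 bytes each); the function name `mss_for_mtu` is hypothetical and only used for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
TCP_HEADER_BYTES = 20   # TCP header without options (assumption)


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in a single IP packet of the given MTU."""
    overhead = IPV4_HEADER_BYTES + TCP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError("MTU too small to carry TCP over IPv4")
    return mtu - overhead
```

With these assumptions, the standard 1500-byte MTU corresponds to an MSS of 1460 and a 9000-byte jumbo MTU to an MSS of 8960. TCP options such as timestamps reduce the usable payload per segment further.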
The default TCP settings can also be changed under System > Profiles by editing nstcp_default_profile.

NetScaler deployment in Azure

When deploying Citrix NetScaler in Microsoft Azure, it needs to be created from the Azure Marketplace, which provides a custom firmware from Citrix. Since the networking capabilities in Azure are still a bit restricted, the Marketplace appliance will have the following limitations:

- Restricted to a single IP address (shared between the NSIP, SNIP and VIP)
- The following ports cannot be used for services: 21, 22, 80, 443, 8080, 67, 161, 179, 500, 520, 3003, 3008, 3009, 3010, 3011, 4001, 5061, 9000, and 7000.
Follo